(57) Abstract

The invention relates to an improved expression system for newly introduced genes in yeast and comprises a yeast
regulon, and preferably a transcription terminator, derived from one of the GAPDH genes of S. cerevisiae. The GAPDH
regulon described, about 850 nucleotides long, proved to be almost ten times as effective as the smaller regulons
described previously. Said regulon and/or terminator can be introduced into yeasts either as part of plasmid molecules
or by incorporation into the yeast genome. Vectors containing the expression system preferably comprise an autonomously
replicating sequence derived from K. lactis (KARS) or an origin of replication originating from the S. cerevisiae
2 micron yeast plasmid. After transformation of yeasts, in particular of the genera Kluyveromyces and Saccharomyces,
with said vectors, the transformed yeasts can produce foreign proteins more effectively. In the case of thaumatin the
presence of a signal polypeptide (the pre-part) appears to be essential for expression in yeast. DNA sequences encoding
other signal polypeptides are also described. The use of codons preferred by yeast, both in structural genes, preferably
those encoding thaumatin-like proteins and chymosin-like proteins, and in DNA sequences encoding signal polypeptides,
is also described.

IMPROVEMENTS IN THE EXPRESSION OF NEWLY INTRODUCED
GENES IN YEAST CELLS



The present invention relates to improvements in the
expression of newly introduced genes in yeast cells.

In particular the invention relates to a DNA sequence
capable of initiating transcription by yeast RNA
polymerase II which includes at least part of the regulon
region of one of the GAPDH genes of S. cerevisiae
(DNA = DeoxyriboNucleic Acid; RNA = RiboNucleic Acid;
GAPDH = GlycerAldehyde-3-Phosphate DeHydrogenase;
mRNA = messenger RNA).

The invention further relates to the use of a DNA
sequence capable of both terminating transcription
by yeast RNA polymerase II and effecting
polyadenylation of the mRNA, which includes at least
part of the termination/polyadenylation region
belonging to one of the GAPDH genes of S. cerevisiae.

The invention also relates to a larger rDNA sequence
which contains at least the above-indicated regulon
region of one of the GAPDH genes of S. cerevisiae and
one or more structural genes different from the GAPDH
genes of S. cerevisiae, which DNA sequence can be
inserted into a recombinant DNA plasmid or into a yeast
chromosome in order to transform yeasts so that they
become able to produce the desired proteins encoded by
the structural genes.

Finally, the invention relates to a process for
preparing a protein by cultivating a yeast containing the
above-mentioned larger rDNA sequences under conditions
whereby the protein is formed and isolating the protein
from that yeast culture, as well as to the proteins
produced by such a process.

BACKGROUND OF THE INVENTION



Developments in recombinant DNA (= rDNA) technology
have made it possible to isolate or synthesize specific
genes or portions thereof from higher organisms such as
man, animals and plants, and to transfer these genes or
gene fragments to microorganisms such as bacteria or
yeasts. The transferred gene is replicated and
propagated as the transformed microorganism replicates.
As a result, the transformed microorganism may become
endowed with the capacity to make whatever protein the
transferred gene or gene fragment encodes, whether it is
an enzyme, a hormone, an antigen, an antibody, or a
portion thereof. The microorganism passes on this
capability to its progeny, so that in effect the transfer
has resulted in a new microbial strain having the
described capability.

A basic fact underlying the application of this
technology for practical purposes is that the DNA of all
living organisms, from microbes to man, is chemically
similar, being composed of the same four nucleotides.
For example, the same nucleotide sequence which codes
for the amino acid sequence specifying preprochymosin
in stomach cells of newborn mammals will, when
transferred to a microorganism, be recognized as coding
for the same amino acid sequence.

The basic constituents of recombinant DNA
technology are:

i)   the gene encoding the protein of interest,
ii)  a vector (plasmid) into which the new gene has to
     be inserted to guarantee stable replication and a
     high level of expression of the gene,
iii) a suitable host microorganism into which the vector
     carrying the new gene can be introduced.

The plasmid vector and the host organism have to be
selected depending on the nature of the protein to be
synthesized, the industrial application of this protein
and the technically possible fermentation and
purification procedures. In most cases it will be highly
important to select a host organism which is above
suspicion with regard to the production of toxic
substances, i.e. a microorganism mentioned in the GRAS
(Generally Recognized As Safe) list.
However, only very few of these GRAS microorganisms
meet all the requirements imposed by the application of
either recombinant DNA or fermentation technology.
A selection on the basis of these positive
criteria shows that, at this moment, certain yeast
species, notably Saccharomyces, Kluyveromyces,
Debaryomyces, Hansenula, Candida, Torulopsis and
Rhodotorula, can be regarded as very promising host
organisms for genetically engineered DNA molecules.

In the present invention use is made of recombinant
DNA, molecular biological and chemical techniques to
construct plasmid vectors that can be stably maintained
within yeasts and, most importantly, contain the
appropriate regulons to bring about a high level of
expression of newly introduced genes.

Several plasmid vectors are now known which can be
used for the transformation of the yeast Saccharomyces
cerevisiae [C.P. Hollenberg, Current Topics in Microb.
and Immunol. 96, 119-144 (1982) and A. Hinnen and B.
Meyhack, Current Topics in Microb. and Immunol. 96,
101-117 (1982)]. To maintain the vector and the inserted
gene within the host cell, these vectors rely either on
autonomously replicating sequences (ARS) isolated from the
chromosomal DNA of particular yeast species or on the
replicating ability of the 2 micron DNA plasmid found
in Saccharomyces cerevisiae. Additionally
these yeast vectors contain a marker by which
transformants can be selected. Examples of such markers are
the leu 2 gene [A. Hinnen et al., Proc. Natl. Acad. Sci.
USA, 75, 1929-1933 (1978)], the trp 1 gene [K. Struhl et
al., Proc. Natl. Acad. Sci. USA, 76, 1035-1039 (1979)], the
lactase gene [R.C. Dickson, Gene, 10, 347-356 (1980)],
and genes which confer on the host cell resistance
to certain antibiotics.

The stability of AR sequences in S. cerevisiae or (K)AR
sequences in Kluyveromyces lactis or K. fragilis is not
always sufficient for the development of a reliable
fermentation process using these yeasts. Therefore,
integration of the foreign structural gene into the
chromosome(s) of the new host cell can be very
important for the industrial application of rDNA-containing
yeasts.

From an economic point of view, not only the
stability of the inserted gene within the yeast cell is
important but also the efficiency with which this gene
is expressed as protein. Based on today's knowledge, the
main routes to achieve high levels of expression of a
newly inserted gene include:

- integration of the structural gene downstream of a
promoter (RNA-initiation) site which can effect a
high transcription frequency of the gene. Ideally
the promoter activity is inducible, i.e. can be
switched on or off by a temperature shift
or by the presence of an inducer in the growth medium.
Potent promoters operating in yeasts are those
responsible for transcription of the genes encoding
glycolytic enzymes. Experimental work done by Maitra
and Lobo [J. Biol. Chem., 246, 489-499 (1971)] suggests
furthermore that some of these promoters are highly
inducible. However, a serious difficulty with regard
to the isolation of such promoters is that up to now
little is known about the nucleotide sequences which
confer regulation and full promoter activity on the
DNA fragment.

- integration of an RNA (RiboNucleic Acid) termination
signal downstream of the structural gene. Owing to
the presence of such a termination signal,
transcription of the structural gene cannot interfere
with the transcription of adjacent operons. Moreover,
transcription seems to be more efficient [K.S. Zaret
and F. Sherman, Cell, 28, 563-573 (1982)] and
polyadenylation of the messenger is likely to occur
correctly, resulting in a more stable mRNA (messenger
RNA) population. However, up to now exact data on the
nucleotide sequences required for termination of
transcription in yeasts are not available.

- the presence of a nucleotide sequence flanking the
AUG codon in the RNA molecule which is optimal for
protein synthesis initiation. According to published
data [M. Kozak, Nucleic Acids Res. 9, 5233-5252
(1981)] the positions -3 and +4 in N X X A U G N are
highly conserved (A or C at -3 and G at +4), which
observation suggests a role for these nucleotides in
the recognition of the AUG codon as a translation
start point by the ribosome. Although one might
expect an efficient yeast promoter to contain either
an A or a C as nucleotide at the -3 position, the
nucleotide at the +4 position forms part of the
coding sequence and is therefore dependent upon the
nature of the gene to be inserted downstream of the
promoter. This indicates that it will be difficult to
fulfil this condition in all cases.

- the copy number of the vector within the host cell.
In most cases high copy numbers will lead to higher
mRNA levels and, therefore, to higher expression
levels of a gene. In S. cerevisiae vectors containing
the 2 micron DNA replication origin can reach copy
numbers as high as 50. This is considerably more than
can be reached by, for instance, integrating vectors or
vectors containing autonomously replicating sequences
(ARS) [K. Struhl et al., Proc. Natl. Acad. Sci. USA, 76,
1035-1039 (1979) and A.J. Kingsman et al., Gene, 7,
141-152 (1979)]. However, in yeast species other than
Saccharomyces, 2 micron DNA has not yet been found,
suggesting that its replication origin will be
functional in a very limited number of yeast species
only. Experimental data obtained so far show that the 2
micron replication origin is functional in
Schizosaccharomyces pombe [D. Beach and P. Nurse,
Nature, 290, 140-142 (1981)], but not in Kluyveromyces
lactis [G. Das and C.P. Hollenberg, Curr. Genet. 5,
123-128 (1982)].

Therefore the transformation of yeasts belonging to
genera other than Saccharomyces or
Schizosaccharomyces will in most cases depend upon the
availability of other DNA replication origins, such as ARS
isolated from the organism to be transformed, or upon
the integration of the foreign gene into the yeast
genome.

- a codon use of the gene which is optimal for the host
organism used. Results obtained with the yeast
S. cerevisiae show a strong correlation between the
abundance of certain tRNA (transfer RNA) species and
the occurrence of the respective codons in its
protein genes. Therefore, optimal expression of, for
instance, the bovine preprochymosin gene or the plant
preprothaumatin gene in S. cerevisiae would require
a chemical synthesis of both genes with a codon
population which correlates with these abundant yeast
tRNA species.

- an additional factor which might influence the
translation of a gene is the presence of a DNA sequence
encoding a so-called signal sequence. These signal
sequences are in most cases hydrophobic N-terminal
protein extensions which are often involved in the
process of cotranslational secretion of the protein
through a membrane. In the present specification new
data are shown, obtained with the expression of the
preprothaumatin gene in yeast, indicating that when
the DNA sequence encoding the signal sequence is
removed from the gene, expression of the gene is
reduced by a few orders of magnitude.
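
By way of illustration only, the flanking-nucleotide criterion listed above (A or C at position -3 and G at position +4, counting the A of the AUG codon as +1) could be checked for a given mRNA sequence with a short routine such as the following Python sketch. The function names and the example sequences are hypothetical and form no part of the specification; the allowed nucleotides are simply those stated in the text and are passed as parameters so that other preferences can be tried.

```python
def aug_context(mrna: str, aug_index: int) -> tuple[str, str]:
    """Return the nucleotides at positions -3 and +4 around an AUG codon.

    aug_index is the 0-based index of the A of the AUG codon (position +1).
    There is no position 0, so -3 lies three nucleotides upstream of the A
    and +4 is the nucleotide immediately following the G.
    """
    mrna = mrna.upper()
    if mrna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG codon at the given index")
    if aug_index < 3 or aug_index + 3 >= len(mrna):
        raise ValueError("sequence too short to contain positions -3 and +4")
    return mrna[aug_index - 3], mrna[aug_index + 3]


def has_preferred_context(mrna: str, aug_index: int,
                          minus3_allowed: str = "AC",
                          plus4_allowed: str = "G") -> bool:
    """Check the -3/+4 preference as stated in the text (assumed defaults)."""
    minus3, plus4 = aug_context(mrna, aug_index)
    return minus3 in minus3_allowed and plus4 in plus4_allowed


if __name__ == "__main__":
    # Hypothetical example sequences, chosen only to exercise the check.
    print(has_preferred_context("GGAUUAUGGCU", 5))  # A at -3, G at +4
    print(has_preferred_context("GGUUUAUGACU", 5))  # U at -3, A at +4
```

Because the nucleotide at +4 belongs to the coding sequence, such a check would in practice be applied after the structural gene has been chosen, which is the difficulty noted in the text.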

At present the number of known yeast species which can
serve as a host for recombinant DNA molecules is
limited to Saccharomyces cerevisiae and
Schizosaccharomyces pombe.

SUMMARY OF THE INVENTION 

Therefore, a need exists for additional yeast species
with other metabolic properties and other nutritional
demands, so that the field to which recombinant DNA
technology can be applied on a commercial level will be
broadened. However, the use of new yeast species
requires suitable DNA replication origins to guarantee
a stable replication of the vector molecule in the new
host as well as an appropriate expression system for
newly inserted DNA. The present invention provides an
expression system, the use of which is not restricted
to S. cerevisiae but which can function in other yeast
species as well. In order to achieve this the RNA
initiation/regulation and the RNA termination/
polyadenylation signals of a glyceraldehyde-3-phosphate
dehydrogenase (GAPDH) gene of S. cerevisiae were
isolated.

Besides the fact that the RNA initiation site of a
GAPDH gene is a very efficient promoter in S.
cerevisiae [M.J. Holland et al., Biochemistry 17, 4900-
4907 (1978)], the promoter according to the present
invention was isolated and applied after it was
realized that GAPDH is a key metabolic enzyme in many
different organisms, suggesting a certain conservation
during evolution of the nucleotide sequence
regulating its expression. On this basis further
experiments were carried out. It was observed (i) that
the isolated and radioactively labelled GAPDH regulon
fragment hybridized with colonies of various yeast
species, (ii) that the S. cerevisiae GAPDH regulon expressed
foreign genes in K. lactis efficiently and (iii) that the
larger regulon region of about 850 nucleotides was more
effective than the smaller regulon region of about 280
nucleotides published by J.P. Holland and M.J. Holland
in J. Biol. Chem. 255, 2596-2605 (1980). These new
findings gave us confidence that the regulon isolated
will prove to be useful in the expression of foreign
DNA in a number of other yeast species.

The present invention provides a DNA sequence capable
of initiating transcription by yeast RNA polymerase II
which includes at least part of the regulon region of
one of the GAPDH genes of S. cerevisiae, characterized
in that it comprises a DNA sequence essentially as
given in Fig. 2, and wherein the regulon region is
optionally modified to include at least one restriction
enzyme cleavage site, to facilitate manipulation of the
nucleotide sequence region for protein synthesis
initiation.

An example of a modification of the regulon is given in
item 4 and Figs. 7A and 8, where the introduction of a
Sac I site is described. Although in Fig. 2 only one
specific DNA sequence is described, it will be clear to
the expert that modifications of this DNA sequence,
either by replacement of one or more nucleotides or by
addition or deletion of one or more nucleotides, which
modifications do not impair the properties of the
regulon region given in Fig. 2, are within the scope
of the invention.

The invention further provides a DNA sequence capable
of both terminating transcription by yeast RNA
polymerase II and effecting polyadenylation of the
mRNA, which includes at least part of the termination/
polyadenylation region belonging to one of the GAPDH
genes of S. cerevisiae, characterised in that it
comprises a DNA sequence essentially as given in Fig. 3.

Indications have been obtained that the presence of
such a DNA sequence is favourable for the expression of
structural genes in yeasts. The presence of such a DNA
sequence seems to be particularly advantageous when
incorporating the rDNA in the genome of a yeast cell.

The invention also provides a DNA sequence, which can
be inserted into a recombinant DNA plasmid or into a
yeast chromosome, comprising:
(a) a DNA sequence essentially as given in Fig. 2, and
(b) one or more structural genes different from the
    GAPDH genes of S. cerevisiae, and at least two of
    features (c)-(f),
(c) one or more specific DNA sequences capable of
    terminating the transcription by yeast RNA polymerase
    II and effecting polyadenylation of the mRNA,
    and/or
(d) one or more selection markers, and/or
(e) either one or more nucleotide sequences allowing a
    stable insertion in a chromosome of yeasts or one
    or more DNA sequences which regulate DNA
    replication in yeasts belonging to the genus
    Saccharomyces or to the genus Kluyveromyces, and/or
(f) a DNA sequence encoding a signal polypeptide of not
    more than 30 amino acid residues assisting the
    translocation of proteins, which DNA sequence is
    situated upstream of and in the same reading frame
    as the structural gene.



In particular, the structural gene of (b) encodes
thaumatin or chymosin, or their various allelic or modified
forms, which modified forms do not impair the
sweet-tasting properties of thaumatin or the milk-clotting
properties of chymosin, respectively, or precursors of
these thaumatin-like or chymosin-like proteins, since
such DNA sequence will assist in giving an increased
expression of these genes.

It is preferred to use a DNA sequence essentially
as given in Fig. 3 as the specific DNA sequence of (c),
since this seems better adapted to yeast than other
termination/polyadenylation regions.

The DNA sequence of (f) encoding a signal polypeptide
can be selected from the group consisting of DNA
sequences encoding

(a) the signal polypeptide translocating S. cerevisiae
invertase, namely Met.Leu.Leu.Gln.Ala.Phe.Leu.Phe.-
Leu.Leu.Ala.Gly.Phe.Ala.Ala.Lys.Ile.Ser.Ala;

(b) the signal polypeptide translocating S. cerevisiae
acid phosphatase, namely Met.Phe.Lys.Ser.Val.Val.-
Tyr.Ser.Ile.Leu.Ala.Ala.Ser.Leu.Ala.Asn.Ala;

(c) the signal polypeptide of unmatured forms of
thaumatin-like proteins, namely Met.Ala.Ala.Thr.-
Thr.Cys.Phe.Phe.Phe.Leu.Phe.Pro.Phe.Leu.Leu.Leu.-
Leu.Thr.Leu.Ser.Arg.Ala;

(d) the signal polypeptide of unmatured forms of
chymosin-like proteins, namely Met.Arg.Cys.Leu.Val.-
Val.Leu.Leu.Ala.Val.Phe.Ala.Leu.Ser.Gln.Gly; and

(e) two consensus signal polypeptides, namely

Met.Ser.Lys.Ala.Ala.Leu.Ala.Phe.Ile.Ala.Phe.Val.-
Ile.Val.Leu.Ile.Val.Asn.Ala and

Met.Ser.Lys.Phe.Val.Ile.Val.Leu.Ile.Val.Ala.Ala.-
Leu.Ala.Phe.Ile.Ala.Asn.Ala.






In order to make the conditions for expression as good
as possible it is recommended to modify the structural
gene of (b) and/or the signal polypeptide-encoding DNA
sequence of (f) such that the codons are codons
preferred by yeasts. Following the teaching of J.P.
Holland and M.J. Holland (J. Biol. Chem. 255, 2596-2605
[1980]) the preferred codons are:
GCC or GCT   alanine          TTG          leucine
AGA          arginine         AAG          lysine
AAC          asparagine       ATG          methionine
GAC or GAT   aspartic acid    TTC          phenylalanine
TGT          cysteine         CCA          proline
CAA          glutamine        TCC or TCT   serine
GAA          glutamic acid    ACC or ACT   threonine
GGT          glycine          TGG          tryptophan
CAC          histidine        TAC          tyrosine
ATC or ATT   isoleucine       GTC or GTT   valine.
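
As an illustration of how the above list of preferred codons might be applied, the following Python sketch back-translates an amino acid sequence into DNA. It is a minimal sketch under stated assumptions: where two codons are listed, the first is used (an arbitrary choice made for the example), the function name is hypothetical, and the invertase signal polypeptide of (a) serves merely as sample input.

```python
# Preferred yeast codons as listed in the text; where the text gives two
# codons, the first-listed one is used here (an assumption of this sketch).
PREFERRED_CODON = {
    "Ala": "GCC", "Arg": "AGA", "Asn": "AAC", "Asp": "GAC", "Cys": "TGT",
    "Gln": "CAA", "Glu": "GAA", "Gly": "GGT", "His": "CAC", "Ile": "ATC",
    "Leu": "TTG", "Lys": "AAG", "Met": "ATG", "Phe": "TTC", "Pro": "CCA",
    "Ser": "TCC", "Thr": "ACC", "Trp": "TGG", "Tyr": "TAC", "Val": "GTC",
}

# Signal polypeptide of S. cerevisiae invertase, as given under (a).
INVERTASE_SIGNAL = ("Met.Leu.Leu.Gln.Ala.Phe.Leu.Phe.Leu.Leu."
                    "Ala.Gly.Phe.Ala.Ala.Lys.Ile.Ser.Ala")


def back_translate(peptide: str) -> str:
    """Translate a dot-separated three-letter peptide into preferred codons."""
    residues = [r.strip() for r in peptide.replace("-", "").split(".")]
    residues = [r for r in residues if r]  # drop empty fields
    return "".join(PREFERRED_CODON[r] for r in residues)


if __name__ == "__main__":
    dna = back_translate(INVERTASE_SIGNAL)
    print(f"{len(dna) // 3} codons: {dna}")
```

A real design would also have to respect restriction sites and the reading frame of the adjoining structural gene, which this sketch does not attempt.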



The DNA sequences described above are not an end in
themselves. They will be used in a process for
preparing yeasts containing these DNA sequences by
introducing these rDNA sequences into Saccharomyces,
Kluyveromyces, Debaryomyces, Hansenula, Candida,
Torulopsis or Rhodotorula yeasts, either in the form of
plasmids or by incorporation in the yeast genome.

The invention further provides yeasts containing such
rDNA sequences, either in the form of a plasmid or
incorporated in the yeast genome, and their use in a
process for preparing a protein by cultivation of such
yeast, whereby the rDNA sequence incorporated in the
yeast contains a structural gene encoding that protein
or a precursor thereof, which precursor can form the
relevant protein during processing.

Finally, the invention provides the proteins produced
by this novel process.

Several constructions in which the GAPDH promoter/
regulation region was combined with structural genes
encoding either preprothaumatin or preprochymosin or
some of the various maturation products have been made,
and synthesis of thaumatin-like proteins in yeast has
been demonstrated. Preferred constructions contained
structural genes encoding either preprothaumatin or
prethaumatin. To improve the expression yield of said
genes, the original codons can be replaced by codons
which are abundantly present in highly expressed genes.
Moreover, the DNA sequence encoding the signal sequence
of preprochymosin can be replaced by DNA sequences
encoding signal sequences of products excreted by yeasts,
such as the signal sequences of invertase and acid
phosphatase produced by S. cerevisiae.

The DNA sequences according to the invention may
comprise nucleotide sequences which regulate DNA
replication in yeasts belonging to the genera Saccharomyces
and Kluyveromyces. If the replication origin is
inserted into an rDNA plasmid, the following combinations
are preferred for Saccharomyces: a combination of the
replication origin of the 2 micron DNA with the leu 2
gene as is present on plasmid pMP81 [C.P. Hollenberg,
Current Topics in Microbiol. and Immunol. 96, 119-144
(1982)] and a combination of the replication origin of
the 2 micron DNA with the trp 1 gene as is
present on plasmid YRp7 [D.T. Stinchcomb et al., Nature
282, 39-43 (1979)]. A preferred replication origin for
Kluyveromyces consists of the KARS-2 sequence in
combination with the trp 1 gene as is present on plasmid
pEK 2-7 described in European Patent Application
No. 0096430 (A1) on pages 21 and 25 and in Fig. 2.

If the DNA sequence has to be inserted into a yeast
genome, it is preferable that the DNA sequence contains
the termination region of Fig. 3 downstream of the
structural gene besides the regulon region of Fig. 2
upstream of the foreign structural gene, which
combination will be inserted by homologous recombination at
the position of the GAPDH gene in the yeast genome (cf.
Fig. 19). An alternative may be that the combination
regulon region - structural gene - termination region
is inserted into a cloned DNA sequence derived from the
yeast genome (K. Struhl, Nature 305, 391-397 [1983] and
R.J. Rothstein, Methods in Enzymology 101, 202-211
[1983]).

For a better understanding of the invention the most
important terms used in the description will be
defined:

An operon is a gene comprising (a) a particular DNA
sequence [structural gene(s)] for polypeptide(s)
expression, (b) a control region or regulon (regulating
said expression) upstream of the structural gene and
mostly consisting of a promoter regulation sequence,
(c) a ribosome binding or interaction DNA sequence and
(d) a control region or transcription terminator
downstream of the structural gene.

Structural genes are DNA sequences which encode through
a template (mRNA) a sequence of amino acids
characteristic of a specific polypeptide.

A promoter is a DNA sequence within the regulon to
which RNA polymerase binds for the initiation of
transcription.

A terminator is a DNA sequence within the operon
comprising, amongst others, particular DNA sequences
involved in the polyadenylation of mRNA and particular
DNA sequences involved in the termination of the
transcription of DNA by RNA polymerase.

Reading frame. The grouping of triplets of nucleotides
(codons) into such a frame that at mRNA level a proper
translation of the codons into the polypeptide takes
place.

Transcription. The process of producing mRNA from a
structural gene.

Translation. The process of producing a polypeptide
from mRNA.

Expression. The process undergone by a structural gene
to produce a polypeptide. It is a combination of many
processes, including at least transcription and
translation.

By signal sequence (also called signal polypeptide or
leader sequence) is meant that part of the pre(pro)protein
which has a high affinity for biomembranes or
special receptor proteins in biomembranes and/or which
is involved in the transport/translocation of the
pre(pro)protein. These transport/translocation processes are
often accompanied by processing of the pre(pro)protein
into one of the mature forms of the protein.

Allelic form. One of the two or more naturally
occurring alternative forms of a gene product.

Chromosome. Thread-like structures into which the
hereditary material of cells is organised.

Genome. The total genetic information of cells
organised in the chromosomes.

Maturation form. One of two or more naturally
occurring forms of a gene product produced by specific
processing, e.g. specific proteolysis.

By maturation forms of preprothaumatin are meant
prothaumatin, prethaumatin (= preprothaumatin without
a carboxyl terminal sequence of 6 amino acids) and
thaumatin (EP-PA 54330 and EP-PA 54331).
By maturation forms of preprochymosin are meant
prochymosin, pseudochymosin and chymosin (EP-PA
77109).

Plus strand. DNA strand whose nucleotide sequence
is identical with the mRNA sequence, with the proviso
that uridine is replaced by thymidine.
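
The definitions of plus strand and reading frame given above can be made concrete with a small Python sketch. It is offered as an illustration only; the function names and the nine-nucleotide example sequence are hypothetical.

```python
def plus_strand_to_mrna(plus_strand: str) -> str:
    """The mRNA has the sequence of the plus strand with T replaced by U."""
    return plus_strand.upper().replace("T", "U")


def codons_in_frame(mrna: str, frame: int = 0) -> list[str]:
    """Group the mRNA into complete triplets starting at offset `frame`."""
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    return [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]


if __name__ == "__main__":
    mrna = plus_strand_to_mrna("ATGGCCACT")  # hypothetical plus strand
    print(mrna)
    print(codons_in_frame(mrna, 0))  # the frame starting at the AUG
    print(codons_in_frame(mrna, 1))  # shifted frame, different codons
```

Shifting the frame by one nucleotide yields an entirely different set of codons, which is why a signal polypeptide-encoding sequence has to be placed in the same reading frame as the structural gene.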

The microbial cloning vehicles - containing (a) the
various forms of the regulons and terminators
(indicated with the suffix -01, -02, etc. in the plasmids
described in the specification) of the glyceraldehyde-
3-phosphate dehydrogenase (GAPDH) operon of S.
cerevisiae, (b) structural genes encoding
preprothaumatin and preprochymosin and their various
maturation forms, (c) various hybrid forms of said
structural genes encoding maturation forms of
preprothaumatin or preprochymosin with special signal
sequences and (d) various chemically synthesized DNA
sequences - are produced by a number of steps, the most
essential of which are:

1.  Isolation of clones containing the GAPDH operon of
    S. cerevisiae.
2.  Isolation of the GAPDH promoter/regulation region
    and its introduction into plasmids encoding
    thaumatin-precursors.
3.  Introduction of the GAPDH promoter/regulation
    region into plasmids encoding chymosin-precursors.
4.  Reconstitution of the original GAPDH promoter/
    regulation region in plasmids encoding
    preprothaumatin by introduction of a synthetic DNA
    fragment (Fig. 7A, Fig. 8).
5.  Reconstitution of the original GAPDH promoter/
    regulation region in plasmids encoding
    (pre)(pro)(pseudo)chymosin by introduction of a
    synthetic DNA fragment.
6.  DNA synthesis.
7.  Structural features of the GAPDH promoter/regulation
    region.
8.  Insertion of fragments of the GAPDH transcription
    termination/polyadenylation region in combination
    with the central transcription termination signal
    of phage M13RF downstream of genes encoding
    pseudochymosin.
9.  Introduction of the 2 micron DNA replication origin
    and the yeast leu 2 gene in plasmids encoding
    thaumatin-precursors and chymosin-precursors.
10. The introduction of GAPDH transcription termination/
    polyadenylation regions into pURY plasmids.
11. Construction of an E. coli-yeast shuttle vector
    widely applicable for gene expression in yeast.
12. Expression in K. lactis of the preprothaumatin
    encoding gene under control of the promoter/
    regulation region of the GAPDH encoding gene of
    S. cerevisiae.
13. Integration of structural genes under control of
    the GAPDH promoter/regulation region into the yeast
    chromosome.
14. Chemical synthesis of structural genes and
    construction of synthetically chimeric genes.

The following detailed description will illustrate the
invention.

1. Isolation of clones containing the glyceraldehyde-
3-phosphate dehydrogenase (GAPDH) operon of
S. cerevisiae.

A DNA pool of the yeast S. cerevisiae was prepared
in the hybrid E. coli-yeast plasmid pFl 1 [M.
Chevallier et al., Gene 11, 11-19 (1980)] by a method
similar to the one described by M. Carlson and D.
Botstein, Cell 28, 145-154 (1982). Purified yeast
DNA was partially digested with restriction
endonuclease Sau 3A and the resulting DNA fragments
(with an average length of 5 kb) were ligated by T4
DNA ligase into the dephosphorylated Bam HI site of
pFl 1. After transformation of CaCl2-treated E.
coli cells with the ligated material a pool of about
30,000 ampicillin-resistant clones was obtained.
These clones were screened by a colony hybridization
procedure [R.E. Thayer, Anal. Biochem., 98, 60-63
(1979)] with a chemically synthesized and 32P-
labelled oligomer with the sequence
5'TACCAGGAGACCAACTT3'.
According to data published by J.P. Holland and M.J.
Holland [J. Biol. Chem., 255, 2596-2605 (1980)]
this oligomer is complementary to the DNA sequence
encoding amino acids 306-310 (the wobble base of the
last amino acid was omitted from the oligomer) of
the GAPDH gene. Using hybridization conditions
described by R.B. Wallace et al., Nucleic Acids Res.
9, 879-894 (1981), six positive transformants could
be identified. One of these harboured plasmid
pFl 1-33. The latter plasmid contained the GAPDH gene
including its promoter/regulation region and its
transcription termination/polyadenylation region.
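
To make the relationship between the oligomer and its target explicit, the following Python sketch derives the plus-strand sequence to which the probe 5'TACCAGGAGACCAACTT3' is complementary and groups it into codons. It is a sketch under stated assumptions: the function names are hypothetical, and the grouping assumes that the complementary stretch begins at the first position of a codon, which the specification does not state.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Probe sequence as given in the text, written 5' to 3'.
PROBE = "TACCAGGAGACCAACTT"


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, also written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(dna.upper()))


def split_codons(dna: str, frame: int = 0) -> tuple[list[str], str]:
    """Split into complete codons plus any trailing incomplete codon."""
    body = dna[frame:]
    complete = len(body) - len(body) % 3
    codons = [body[i:i + 3] for i in range(0, complete, 3)]
    return codons, body[complete:]


if __name__ == "__main__":
    target = reverse_complement(PROBE)
    codons, remainder = split_codons(target)  # frame 0 is an assumption
    print(target)     # AAGTTGGTCTCCTGGTA
    print(codons)     # ['AAG', 'TTG', 'GTC', 'TCC', 'TGG']
    print(remainder)  # TA
```

Under the frame assumption made here, the two trailing nucleotides form an incomplete final codon, which is consistent with the statement that the wobble base of the last amino acid was omitted from the oligomer.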

The approximately 9 kb long insert of pFl 1-33 has
been characterized by restriction enzyme analysis
(Fig. 1) and partial nucleotide sequence analysis
(Figs. 2 and 3).

Note: Unless stated otherwise, all enzyme
incubations were carried out under conditions
described by the supplier. Enzymes were
obtained from Amersham, Boehringer, BRL or
Biolabs.

2. Isolation of the GAPDH promoter/regulation region 
and its introduction into plasmids encoding 
thaumatin-precursors (Fig. 4) 

On the basis of the restriction enzyme analysis and
the nucleotide sequence data of the insert of
plasmid pFl 1-33, the RNA initiation/regulation
region of the GAPDH gene was isolated as an 800
nucleotides long Dde I fragment. To identify this
promoter fragment, plasmid pFl 1-33 was digested
with Sal I and the three resulting DNA fragments
were subjected to a Southern hybridization test
with the chemically synthesized oligomer [E.M.
Southern, J. Mol. Biol. 98, 503-517 (1975)]. A
positively hybridizing 4.3 kb long restriction fragment
was isolated on a preparative scale by electroelution
from a 0.7% agarose gel and was then cleaved
with Dde I. Of the resulting Dde I fragments only
the largest one had a recognition site for Pvu II,
a cleavage site located within the GAPDH promoter
region (Fig. 1). The largest Dde I fragment was
isolated and incubated with Klenow DNA polymerase
and four dNTPs (A.R. Davis et al., Gene 10, 205-
218 (1980)) to generate a blunt-ended DNA molecule.

After extraction of the reaction mixture with
phenol/chloroform (50/50 v/v), passage of the
aqueous layer through a Sephadex G50 column and
ethanol precipitation of the material present in
the void volume, the DNA fragment was equipped with
the 32P-labelled Eco RI linker 5'-GGAATTCC-3' by
incubation with T4 DNA ligase. Owing to the Klenow
DNA polymerase reaction and the subsequent ligation
of the Eco RI linker, the original Dde I sites were
reconstructed at the ends of the promoter fragment.
To inactivate the ligase the reaction mixture was
heated to 65°C for 10 minutes, then sodium chloride
was added (final concentration 50 mmol/l) and the
whole mix was incubated with Eco RI. Incubation was
terminated by extraction with phenol/chloroform,
the DNA was precipitated twice with ethanol,
resuspended and then ligated into a suitable vector
molecule. Since the Dde I promoter fragment was
equipped with Eco RI sites it can be easily
introduced into the Eco RI site of pUR 528, pUR 523
and pUR 522 (EP-PA 54330 and EP-PA 54331) to create
plasmids in which the yeast promoter is adjacent to
the structural genes encoding thaumatin precursors.
The latter plasmids were obtained by cleavage of
pUR 528, pUR 523 and pUR 522 with Eco RI, treatment
of the linearized plasmid molecules with (calf
intestinal) alkaline phosphatase to prevent self-
ligation and incubation of each of these vector
molecules, as well as the purified Dde I promoter
fragment, with T4 DNA ligase. Transformation of the
various ligation mixes in CaCl2-treated E. coli
HB101 cells yielded several ampicillin resistant
colonies. From some of these colonies plasmid DNA
was isolated [H.C. Birnboim and J. Doly, Nucleic
Acids Res. 7, 1513-1523 (1979)] and incubated with
Pvu II to test the orientation of the insert.
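
Why the Dde I sites are restored by the fill-in and linker ligation can be
checked with a small sketch. It assumes the usual cleavage pattern of Dde I
(C^TNAG, leaving a 5' TNA overhang); the interior sequence and the helper
names are placeholders chosen for illustration and do not represent the
GAPDH promoter sequence.

```python
import re

# Recognition sequences; Dde I is CTNAG (cut C^TNAG), Eco RI is GAATTC.
DDE_I = re.compile(r"(?=CT[ACGT]AG)")  # lookahead so overlaps are counted
ECO_RI = "GAATTC"
LINKER = "GGAATTCC"                    # Eco RI linker quoted in the text


def filled_in_dde_fragment(interior: str, left_n: str, right_n: str) -> str:
    """Top strand of a Dde I fragment after Klenow fill-in.

    After cutting at C^TNAG and filling in the 5' overhangs, the
    blunt-ended fragment starts with TNAG and ends with CTNA.
    """
    return "T" + left_n + "AG" + interior + "CT" + right_n + "A"


def count_dde_sites(seq: str) -> int:
    return len(DDE_I.findall(seq))


# Hypothetical interior, chosen to contain no Dde I or Eco RI site.
fragment = filled_in_dde_fragment("ACGTACGTACGT", left_n="G", right_n="C")
with_linkers = LINKER + fragment + LINKER

print(count_dde_sites(fragment), count_dde_sites(with_linkers),
      with_linkers.count(ECO_RI))
```

The linker begins with G and ends with C, so each junction supplies the base
that the filled-in end lacks, which is the point made in the paragraph above.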

In the nomenclature, plasmids containing the Eco RI
(Dde I) GAPDH promoter fragment in the correct
orientation (i.e. transcription from the GAPDH
promoter occurs in the direction of a downstream
located structural gene) are indicated by the
addendum -01 to the original code of the plasmid
(for example pUR 528 is changed into pUR 528-01;
see Fig. 4).

To facilitate manipulation of plasmids containing
the Eco RI promoter fragment, one of the two Eco RI
sites was destroyed. Two µg of plasmid DNA (e.g.
pUR 528-01) was partially digested with Eco RI and
then incubated with 5 units Mung bean nuclease
(obtained from P.L. Biochemicals Inc.) in a total
volume of 200 µl in the presence of 0.05 moles/l
sodium acetate (pH 5.0), 0.05 moles/l sodium
chloride and 0.001 moles/l zinc chloride for 30
minutes at room temperature to remove sticky ends.
The nuclease was inactivated by addition of SDS to
a final concentration of 0.1% [D. Kowalski et al.,
Biochemistry 15, 4457-4463 (1976)] and the DNA was
precipitated by the addition of 2 volumes of
ethanol (in this case the addition of 0.1 volume of
3 moles/l sodium acetate was omitted). Linearized
DNA molecules were then religated by incubation
with T4 DNA ligase and used to transform CaCl2-
treated E. coli cells. Plasmid DNA isolated from
ampicillin resistant colonies was tested by
cleavage with Eco RI and Mlu I for the presence of
a single Eco RI site adjacent to the thaumatin gene
(cf. Fig. 4).

Plasmids containing the GAPDH promoter fragment,
but having only a single Eco RI recognition site
adjacent to the ATG initiation codon of a downstream
located structural gene, are referred to as
-02 type plasmids (for example: pUR 528-01 is
changed into pUR 528-02; see Fig. 4).

3. Introduction of the GAPDH promoter/regulation
region into plasmids encoding chymosin-precursors.

To construct plasmids containing the GAPDH
promoter/regulation region adjacent to structural
genes encoding chymosin precursors, use was made of
plasmid pUR 528-02 (cf. 2) digested with Eco RI and
Hind III as a vector molecule in which various
structural genes were inserted. To overcome the
problem that all of the (pre)(pro)(pseudo)chymosin
encoding plasmids (pUR 1524, pUR 1523 and pUR 1521,
respectively, as described in EP-PA 77109) contain
an additional Eco RI site within the structural
gene, a Hind III digestion was carried out in
combination with a partial Eco RI digest. Restriction
fragments containing the intact and isolated gene
were extracted from the 1% agarose gel and added to
a T4 DNA ligation mix together with the vector. The
vector was prepared by digesting pUR 528-02 with
Hind III and Eco RI, treating the resulting fragments
with phosphatase and isolating the largest
fragment from a 0.7% agarose gel. After
transformation of CaCl2-treated E. coli cells with the
ligation mix and selection for ampicillin resistant
colonies, plasmids containing the GAPDH promoter/
regulation fragment adjacent to the (pre)(pro)-
(pseudo)chymosin encoding structural genes in the
appropriate orientation could be isolated. In a
similar way, plasmid pUR 1522 can be converted into
plasmid pUR 1522-02.

Plasmids containing the GAPDH promoter fragment and
the genes coding for prepro-, pro-, pseudochymosin
and chymosin are referred to as pUR 1524-02, pUR
1523-02, pUR 1521-02 and pUR 1522-02, respectively.

4. Reconstitution of the original GAPDH promoter/
regulation region in plasmids encoding preprothaumatin
by introduction of a synthetic DNA fragment
(Fig. 7A, Fig. 8).

As shown by the nucleotide sequence depicted in
Fig. 2, the Eco RI (Dde I) GAPDH promoter fragment
contains the nucleotides -844 to -39 of the original
GAPDH promoter/regulation region. Not contained
in this promoter fragment are the 38 nucleotides
preceding the ATG initiation codon of the GAPDH
encoding gene. The latter (38) nucleotide fragment
contains the PuCACACA sequence, which is found in
several yeast genes. Said PuCACACA sequence located
about 20 bp upstream of the translation start site
[M.J. Dobson et al., Nucleic Acids Res. 10, 2625-
2637 (1982)] provides the nucleotide sequence which
is optimal for protein initiation [M. Kozak,
Nucleic Acids Res. 9, 5233-5252 (1981)]. Moreover,
as shown in Fig. 6, these 38 nucleotides allow the
formation of a small loop structure which might be
involved in the regulation of expression of the
GAPDH gene.
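
The coordinates quoted above can be cross-checked with simple arithmetic.
The sketch below assumes the common numbering convention in which position
-1 immediately precedes the A of the ATG codon and there is no position 0;
the function name span_length is illustrative only.

```python
def span_length(first: int, last: int) -> int:
    """Number of nucleotides from `first` to `last`, both inclusive.

    Both positions are assumed to lie upstream of the ATG (negative
    numbers), so no correction for a missing position 0 is needed.
    """
    return abs(first - last) + 1


promoter_fragment = span_length(-844, -39)  # Eco RI (Dde I) fragment
missing_leader = span_length(-38, -1)       # nucleotides preceding the ATG
full_region = span_length(-844, -1)         # reconstituted region

print(promoter_fragment, missing_leader, full_region)
```

On this reckoning the cloned fragment spans 806 nucleotides, which fits the
description of it as an approximately 800 nucleotides long Dde I fragment,
and the fragment plus the 38 missing nucleotides account for the whole
region from -844 to -1.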

On the basis of the above-mentioned arguments,
introduction of the 38 nucleotides between the Dde I
promoter fragment and the ATG codon of a downstream
located structural gene was considered necessary to
improve promoter activity as well as translation
initiation.

As outlined in Fig. 7A the missing DNA fragment was
obtained by the chemical synthesis of two partially
overlapping oligomers. The Sac I site present in
the overlapping part of the two oligonucleotides
was introduced for two reasons: (i) to enable
manipulation of the nucleotide sequence immediately
upstream of the ATG codon including the construction
of poly A-tailed yeast expression vectors (see
11); (ii) to give a cleavage site for an enzyme
generating 3'-protruding ends that can easily and
reproducibly be removed by incubation with T4 DNA
polymerase in the presence of the four dNTP's.
Equimolar amounts of the two purified oligomers
were phosphorylated at their 5'-termini, hybridized
[J.J. Rossi et al., J. Biol. Chem. 257,
9226-9229 (1982)] and converted into a double-stranded DNA
molecule by incubation with Klenow DNA polymerase
and the four dNTP's under conditions which have
been described for double-stranded DNA synthesis
[A.R. Davis et al., Gene 10, 205-218 (1980)].
Analysis of the reaction products by electrophoresis
through a 13% acrylamide gel followed by
autoradiography showed that more than 80% of the
starting single-stranded oligonucleotides were
converted into double-stranded material. The DNA
was isolated by passage of the reaction mix over a
Sephadex G50 column and ethanol precipitation of
the material present in the void volume. The DNA
was then phosphorylated by incubation with
polynucleotide kinase and digested with Dde I. To
remove the nucleotides cleaved off in the latter
reaction, the reaction mix was subjected to two
precipitations with ethanol.

As shown in Fig. 8, cloning of the resulting
synthetic DNA fragment was carried out by the
simultaneous ligation of this fragment and a Bgl II-Dde
I GAPDH promoter/regulation fragment in a vector
molecule from which the Eco RI site preceding the
ATG initiation codon was removed by Mung bean
nuclease digestion (cf. 2). The Bgl II-Dde I
promoter/regulation fragment was obtained by
digestion of plasmid pUR 528-02 with Dde I and Bgl
II. Separation of the resulting restriction fragments
by electrophoresis through a 2% agarose gel
and subsequent isolation of the fragment from the
gel yielded the purified 793 nucleotides long
promoter/regulation fragment. In the plasmid pUR
528-02 the nucleotide sequence preceding the ATG
codon is 5'-GAATTCATG-3' (EP-PA 54330 and EP-PA
54331), which is different from the favourable
nucleotide sequence given by M. Kozak [Nucleic
Acids Res. 9, 5233-5252 (1981)]. Since our aim was
to reconstitute the original GAPDH promoter/regulation/
protein initiation region as accurately as
possible, the Eco RI site was removed in order to
ligate the synthetic DNA fragment to the resulting
blunt end. Removal of the Eco RI site was
accomplished by Mung bean nuclease digestion of Eco RI-
cleaved pUR 528-02 DNA (see 2).

Subsequently the plasmid DNA was digested with Bgl
II and incubated with phosphatase. After separation
of the two DNA fragments by electrophoresis through
a 0.7% agarose gel, the largest fragment was
isolated and used as the vector in which the Bgl
II-Dde I promoter fragment as well as the (Dde I-
treated) synthetic DNA fragment were ligated.
Plasmids in which the Dde I promoter/regulation
fragment has been introduced together with the
synthetic DNA fragment containing the Sac I
recognition site are indicated by the addendum -03 (for
example: pUR 528-02 is changed into pUR 528-03).

Similar results can be obtained with plasmids
containing one of the maturation forms of
preprothaumatin as the structural gene, i.e. prethaumatin,
prothaumatin and thaumatin, which will
result in plasmids pUR 522-03, pUR 523-03 and
pUR 520-03, respectively.

In order to reconstitute the original GAPDH
promoter/regulation region as accurately as
possible, the Sac I site was removed from plasmid
pUR 528-03 (Fig. 9). This was accomplished by
digestion of the plasmid DNA with Sac I to generate
a linearized plasmid molecule with protruding 3'
ends. These ends were then made blunt-ended using
the 3'-exonuclease activity of T4 DNA polymerase in
the presence of the four dNTP's [T. Maniatis et al.
in Molecular Cloning; Cold Spring Harbor Laboratory
117-120 (1982)]. Circularisation of the linear
plasmid was accomplished by using T4 DNA ligase.

Plasmids from which the Sac I site present in the
synthetic DNA fragment is removed are referred to
as -04 type plasmids (for example: pUR 528-03 is
changed into pUR 528-04; see Fig. 9).

Similar results can be obtained with plasmids
containing a structural gene for one of the maturation
forms of preprothaumatin, i.e. prethaumatin,
prothaumatin and thaumatin, which will result in e.g.
plasmids pUR 522-04, pUR 523-04 and pUR 520-04,
respectively.

5. Reconstitution of the original GAPDH promoter/
regulation region in plasmids encoding (pre)(pro)-
(pseudo)chymosin by introduction of a synthetic DNA
fragment (Fig. 7B, Fig. 10).

To construct (pre)(pro)(pseudo)chymosin encoding
plasmids containing the -03 type GAPDH promoter/
regulation region (see 4), use can be made of
plasmid pUR 528-03 digested with Sac I and Hind III as
a vector molecule in which the various structural
genes can be inserted together with a synthetic DNA
fragment (Fig. 10). The (pre)(pro)(pseudo)chymosin
encoding genes can be isolated from the plasmids
pUR 1524, pUR 1523, pUR 1521 and pUR 1522 (cf.
EP-PA 77109) respectively by incubation with Sal I
in combination with a partial Eco RI digestion to
overcome cleavage of the additional Eco RI site
in the chymosin gene (cf. 3). The resulting DNA
fragments can then be incubated with Mung bean
nuclease (cf. 2) followed by a digestion with Hind
III. Restriction fragments containing the intact
and isolated gene can be purified by electrophoresis
through a 1% agarose gel and then
extracted from the gel. The synthetic DNA fragment
to be used in the constructions is depicted in Fig.
7B.



The vector molecule can be prepared by digestion of
plasmid pUR 528-03 with Sac I and Hind III followed
by a phosphatase treatment of the restriction
fragments and isolation of the largest fragment from
the 0.7% agarose gel. Vector molecule, synthetic
DNA fragment (Sac I-treated) and the DNA fragment
containing the (pre)(pro)(pseudo)chymosin encoding
nucleotide sequence can be incubated with T4 DNA
ligase and transformed in CaCl2-treated E. coli
cells. Plasmid DNA obtained from ampicillin-
resistant colonies can be tested by incubation with
various restriction enzymes.

The nomenclature of the newly created plasmids is
similar to the nomenclature of the (pre)(pro)-
thaumatin-encoding plasmids (pUR 1524-03, pUR 1523-
03, pUR 1521-03 and pUR 1522-03). Removal of the
Sac I site has been described also (see 4). Removal
of the Sac I sites will result in the plasmids pUR
1524-04, pUR 1523-04, pUR 1521-04 and pUR 1522-04,
respectively (see Fig. 10).

6. DNA synthesis. 

Desired oligonucleotides were synthesized on a
polystyrene support using phosphotriester
methodology and a library of dimers [G.A. van der Marel
et al., Recl. Trav. Chim. Pays-Bas 101, 234-241
(1982)]. Chloromethylated polystyrene (1.34
mmoles/gram) was functionalized with 2-(4-
hydroxyphenyl)ethanol [Su-Sun Wang, J. Am. Chem.
Soc. 95, 1325-1333 (1973)] and then coupled with 5'
phosphorylated uridine protected as a mixture of 2'
and 3' acetate and levulinyl groups (analogously to
G.A. van der Marel et al., vide supra). This gave a
support with a levulinyl functionality of 130
µmoles/g. The synthesis cycle was carried out
in a 20 ml 2-necked pear shaped flask, one neck of
which carried a sintered glass filter. This allowed
the functionalized polystyrene (60 mg) to remain in
the flask throughout the synthesis, which consisted
of the following steps (cf. Fig. 11):

1). Removal of the "levulinyl" group with hydrazine
    hydrate (0.5 mol/l) in propionic acid/pyridine
    (1:3 v/v) for 5 minutes.
2). Washing with 2 x 2 ml pyridine.
3). Addition of a fourfold molar excess of the
    required dinucleotide anion (60 µmoles)
    protected in the 5' position as levulinyl
    ester. This was coevaporated twice with
    pyridine and reduced to approximately 0.5 ml
    before adding MSNT (240 µmoles) [C.B. Reese
    et al., Tetrahedron Letters 2727-2730 (1978)].
    The mixture was shaken for 60 minutes (except
    the first cycle which was lengthened to 90
    minutes).
4). Washing with 2 x 2 ml pyridine.
5). Addition of acetic anhydride (0.2 ml) and
    dimethylaminopyridine in pyridine (1.5 ml,
    0.05 M) to react with any unreacted hydroxyl
    group during 5 minutes.
6). Washing with 2 x 2 ml pyridine.

This cycle was repeated with the appropriate
dimers until the desired sequence had been
prepared. The 2-chlorophenyl phosphate
protecting groups were removed with 0.3 mol/l
1,1,4,4-tetramethylguanidinium 2-
pyridinealdoximate (C.B. Reese et al., vide
supra) in dry acetonitrile for 24 hours. After
filtration and washing of the support the base
protecting groups (dA = benzoyl, dC = anisoyl,
dG = diphenylacetyl) were cleaved with
concentrated aqueous ammonia for 60 hours at
50°C, which also cleaves the sequence from the
uridine attached to the support [R. Crea and T.
Horn, Nucleic Acids Res. 8, 2331-2348 (1980)].
Filtration of the aqueous phase from the
support and evaporation gave a crude mixture
which was given a clean-up by chromatography on
a short column (40 cm) of Sephadex G50 with
0.05 M triethylammonium bicarbonate as eluent.
Collecting and evaporating the first five
fractions containing UV absorbing material gave
a concentrate suitable for preparative gel
electrophoresis.

7. Structural features of the GAPDH promoter/regulation
region (Fig. 2, Fig. 6)

TATA or TATAAA sequences are believed to play an
important role in positioning RNA initiation sites
in eucaryotic promoter structures which are
recognized by RNA polymerase II [C. Breathnach and
P. Chambon, Annu. Rev. Biochem. 50, 349-384
(1981)]. Usually transcription is initiated 25 to
30 nucleotides 3' to such sequences although for
yeast varying distances (up to 70 nucleotides; M.J.
Dobson et al., Nucleic Acids Res. 10, 2625-2637
(1982)) have been described. According to the
nucleotide sequence data obtained during the
present investigations for the GAPDH promoter/regulation
region, this structure contains two
additional sets of TATA and TATAAA sequences which
are located towards the ends of the promoter
fragment (Fig. 2). Most likely the clustered
5'-TATATAA-3' sequence occurring around position -130
is responsible for transcription of the GAPDH
encoding gene. The two other TATA and TATAAA
sequences present around positions -608 and -770
are possibly involved in regulation of GAPDH
expression (see below; P.K. Maitra and Z. Lobo, J.
Biol. Chem. 246, 489-499 (1971)). Besides these
RNA initiation signals, the GAPDH promoter/regulation
fragment contains various nucleotide sequences
which are implicated in transcription termination.
K.S. Zaret and F. Sherman [Cell 28, 563-573 (1982)]
found the sequences TAG-N-TAGT-N'-TTT or TAG-N"-
TATGT-N"'-TTT, in which N, N', N" and N"' represent
the variable distances between the groups, in the
3' flanking sequences of a majority of yeast genes
examined. Identical or similar sequences occur in
the GAPDH promoter/regulation fragment around
position -625 (TAT-N4-TAGT-N5-TTT), around
position -324 (TAC-N13-TAGT-N17-TTA), around
position -192 (TAC-N^-TATGT-N^-TTT) and around
position -180 (TAT-N17-TAGT-N23-TTT). In Fig. 6
these postulated termination signals are
represented by slashes. The largest open translation
reading frame present on the GAPDH promoter/regulation
fragment extends from position -450 to -337
and encodes a peptide 38 amino acids long. Since
the TATA box around position -608 precedes the open
reading frame it might well be that the putative
peptide is translated from a transcript initiated
downstream of this TATA sequence. Particularly
interesting is the observation that the ATG
initiation codon of the peptide forms part of a
secondary structure which is generated upon base
pairing of the nucleotides extending from -448 to
-436 with the nucleotides extending from -419 to
-407. These features are reminiscent of the
situation described for the yeast leu 2 gene [A.
Andreadis et al., Cell 31, 319-325 (1982)] and
together with the presence of the small stem/loop
structure preceding the ATG codon of the GAPDH
encoding gene they might be involved in regulating
the expression of the latter gene.
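
The positions given for the open reading frame and the postulated stem can
be checked for internal consistency. The sketch below treats positions as
plain integers; it assumes that the ATG occupies the first three positions
of the reading frame (-450 to -448) and that the quoted span does not
include a stop codon. The variable names are illustrative.

```python
def positions(first: int, last: int) -> set[int]:
    """All nucleotide positions from `first` to `last`, inclusive."""
    return set(range(first, last + 1))


orf = positions(-450, -337)          # open reading frame from the text
start_codon = positions(-450, -448)  # assumed location of the ATG
arm_5 = positions(-448, -436)        # 5' arm of the postulated stem
arm_3 = positions(-419, -407)        # 3' arm of the postulated stem
loop = positions(-435, -420)         # nucleotides between the two arms

orf_codons = len(orf) // 3
shared = sorted(start_codon & arm_5)  # ATG positions inside the stem

print(len(orf), orf_codons, len(arm_5), len(arm_3), len(loop), shared)
```

Both arms come out at the same length, as base pairing requires, and the
reading frame length is a whole number of triplets matching the peptide
length stated in the text. Under the stated assumption, only the last base
of the ATG falls within the 5' arm.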

8. Insertion of fragments of the GAPDH transcription
termination/polyadenylation region in combination
with the central transcription termination signal
of phage M13 RF downstream of genes encoding
pseudochymosin (Fig. 12).

To isolate fragments of the GAPDH transcription
termination/polyadenylation region, plasmid pFl 1-33
was cleaved with restriction enzyme Afl II.
Digestion with this enzyme yielded two fragments, the
smaller of which has a length of 1307 nucleotides
and encompasses the GAPDH termination/polyadenylation
region from position 11 to 1317 (Fig. 12). The
latter fragment was isolated, incubated with Mung
bean nuclease to generate blunt ends (cf. 2) and
then equipped with the Bam HI linker
(5'-CCGGATCCGG-3') using T4 DNA ligase. Owing to the
presence of a naturally occurring Bam HI site in the
middle (around position 690) of the Afl II fragment
isolated, digestion of the ligation mix with Bam HI
resulted in two fragments (A and B; Fig. 12). The
larger of these two fragments (A) has a length of
677 nucleotides and contains the nucleotide
sequence region which is located immediately
downstream of the TAA translation termination codon.
Attempts to subclone the purified fragment A in
the Bam HI site of pBR322 proved to be unsuccessful
since very few transformants were obtained and
these transformants always contained fragment A as
well as fragment B in various orientations (cf.
plasmid 294-17 depicted in Fig. 12).

To overcome this instability of fragment A, use
was made of the central transcription termination
signal of bacteriophage M13 RF [L. Edens et al.,
Nucleic Acids Res. 2, 1811-1820 (1975)]. M13 RF was
digested with Taq I and the resulting fragments
were made blunt-ended by incubation with Mung bean
nuclease. The fragments were then equipped with a
Hind III linker (5'-AGAAGCTTCT-3') using T4 DNA
ligase followed by a digestion with Hind III. The
DNA was precipitated from the reaction mix by the
addition of two volumes of ethanol. The precipitate
was resuspended and the various restriction
fragments were separated by electrophoresis through
a 4% acrylamide gel. From this gel the 441
nucleotides long fragment containing the central
transcription termination signal was isolated.
After cleavage of the purified fragment with Sau
3A, the fragment containing the nucleotides 1509 to
1717 [P.M.G. van Wezenbeek et al., Gene 11, 129-148
(1980)] was cloned in pBR322 which had been
digested with Hind III and Bam HI (Fig. 12).

To create plasmids in which the 3'-untranslated
region of the genes encoding (pre)(pro)(pseudo)-
chymosin was replaced by the 3'-untranslated region
of the GAPDH encoding gene, plasmid pUR 1521-02 was
isolated from an E. coli strain deficient in
adenine methylase and cleaved with restriction
endonuclease Bcl I. This enzyme recognizes the
(unmethylated) sequence 5'-TGATCA-3' and cleaves the
(pre)(pro)(pseudo)chymosin encoding genes within
their translation termination codon TGA. To
generate suitable vector molecules in which both
fragment A and the M13 transcription termination
signal could be cloned, the Bcl I-cleaved plasmid
molecules were incubated first with Sal I, followed
by alkaline phosphatase. The larger of the two
restriction fragments generated was isolated from
the 0.7% agarose gel. For the isolation of fragment
A, plasmid 294-17 was digested with Hind III and
cleaved partially with Bam HI. The 1020 nucleotides
long Hind III-Bam HI fragment containing the Hpa I
site was obtained by electro-elution from the 1%
agarose gel. The latter fragment, the vector and
the 485 nucleotides long Hind III-Sal I fragment
obtained from pBR322 containing the M13
transcription termination signal were incubated with T4
DNA ligase and transformed in CaCl2-treated E.
coli cells. This finally yielded the plasmid pUR
1521-12. Similar results can be obtained with
plasmids pUR 1524-02, pUR 1523-02 and pUR 1522-02,
which will result in plasmids pUR 1524-12, pUR
1523-12 and pUR 1522-12, respectively.
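
The two properties of the Bcl I site used above, namely that it contains the
adenine methylase target GATC and that the cut falls inside a TGA stop codon
lying at the start of the site, can be sketched as follows. The cleavage
position T^GATCA is the commonly documented one for Bcl I; the short gene
tail is a made-up sequence, not the chymosin sequence, and the function name
is illustrative.

```python
BCL_I_SITE = "TGATCA"
BCL_I_CUT_OFFSET = 1   # Bcl I cuts T^GATCA, i.e. after the first base
DAM_TARGET = "GATC"    # sequence methylated by adenine (dam) methylase
STOP_CODON = "TGA"


def bcl_i_cut_positions(seq: str) -> list[int]:
    """Indices at which the top strand is cut (cut lies before the index)."""
    cuts, start = [], seq.find(BCL_I_SITE)
    while start != -1:
        cuts.append(start + BCL_I_CUT_OFFSET)
        start = seq.find(BCL_I_SITE, start + 1)
    return cuts


# Hypothetical 3' end of a coding sequence ending in a TGA stop codon
# that overlaps a Bcl I site (placeholder sequence for illustration).
gene_tail = "GCTGCCTGATCAGCT"
stop_start = gene_tail.find(STOP_CODON)
cuts = bcl_i_cut_positions(gene_tail)

site_is_dam_sensitive = DAM_TARGET in BCL_I_SITE
cut_inside_stop = stop_start < cuts[0] < stop_start + len(STOP_CODON)

print(cuts, site_is_dam_sensitive, cut_inside_stop)
```

The presence of GATC inside the recognition sequence is the reason the
plasmid has to be prepared from a methylase-deficient host before Bcl I
digestion, as described in the text.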

In the nomenclature, plasmids containing fragment A
and the M13 transcription termination signal
downstream of a newly inserted structural gene are
indicated by replacing the addendum -0x with -1x.

9. Introduction of the 2 micron DNA replication origin
and the yeast leu 2 gene in plasmids encoding
thaumatin-precursors and chymosin-precursors (Fig.
13, Fig. 14).

The E. coli-yeast shuttle vector pMP81 [Fig. 13;
C.P. Hollenberg, Current Topics in Microbiol. and
Immunol. 96, 119-144 (1982)] consists of plasmid
pCR1 [C. Covey et al., MGG 145, 155-158 (1976)]
and a double Eco RI fragment of pJDB 219 [J.D.
Beggs, Nature 275, 104-109 (1978)] carrying both
the leu 2 gene and the yeast 2 micron DNA
replication origin. The latter two functions can
be excised from pMP81 by a digestion with Hind III
and Sal I. The resulting 4.4 kb long restriction
fragment was introduced into the various pBR322
derivatives containing genes encoding thaumatin-
precursors in combination with the various forms of
the GAPDH promoter region of S. cerevisiae. A
similar procedure was used to introduce the 4.4 kb
long restriction fragment into the various pBR322
derivatives containing genes encoding chymosin-
precursors in combination with the GAPDH promoter
region of S. cerevisiae.

The introduction of the Hind III-Sal I fragment
containing the 2 micron replication origin and the
leu 2 gene into the various plasmids was accomplished by
cleavage of the E. coli plasmids with Hind III and
Sal I (cf. Fig. 14) and a subsequent treatment of
the resulting fragments with phosphatase. After
separation of the fragments by electrophoresis
through a 1% agarose gel, the largest fragment was
isolated, mixed with the purified Hind III-Sal I
fragment obtained from pMP81, ligated with T4 DNA
ligase and transformed into CaCl2-treated E. coli
cells.


In the pseudochymosin encoding plasmid containing
termination fragment A (pUR 1521-12; cf. Fig. 12),
the 4.4 kb long Hind III-Sal I fragment was
inserted as well. For this purpose plasmid pUR
1521-12 was digested with both Hind III and Sal I
(to remove the M13 transcription termination
region) after which the fragment from pMP81
containing the 2 micron origin and the leu 2
gene could be inserted. Unexpectedly, this plasmid
construction (pURY 1521-12) proved to be stable in
E. coli.

From some of the ampicillin-resistant transformants
plasmid DNA was isolated and subjected to
restriction enzyme analysis. Plasmids containing the
correct insert were purified by CsCl-ethidium bromide
density gradient centrifugation and used to
transform S. cerevisiae AH22 (A. Hinnen et al.,
Proc. Natl. Acad. Sci. USA 75, 1929-1933 (1978))
according to the procedure of J.D. Beggs, Nature
275, 104-109 (1978). This resulted in plasmids
indicated by the letter code pURY but having the
same figure codes (cf. Fig. 14 in which the
conversion of plasmid pUR 528-03 into pURY 528-03 is
indicated). Similarly, plasmids pUR 522-02, pUR
523-02, pUR 1524-02, pUR 1523-02 and pUR 1521-02
were converted into their corresponding pURY
plasmids.

With the availability of yeast transformants
containing the newly constructed plasmids, the effects
of plasmid variation on gene expression could be
monitored. For this purpose an enzyme-linked
immunosorbent assay [ELISA; A. Voller et al., Bull.
World Health Organ. 53, 55-65 (1976)] for
thaumatin was developed and used to quantitate the
amounts of thaumatin-like protein present in yeast
extracts. The results obtained in these experiments
show that upon the introduction of the -02 or -03
type GAPDH promoter in pURY 528 (cf. 2, 4),
thaumatin synthesis is increased by more than two
orders of magnitude (more than 100 times). Upon the
introduction of the 280 nucleotides long promoter
fragment described by J.P. Holland and M.J. Holland
[J. Biol. Chem. 255, 2596-2605 (1980)], however,
thaumatin synthesis increased by only one order of
magnitude (about 10 times), thereby demonstrating
the important role played by upstream sequences of
the GAPDH promoter in its functioning, which is
contrary to the conclusions of Holland and Holland
[cf. J. Biol. Chem. 254, 9839-45 (1979)].
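
The comparison between the two promoter fragments reduces to a ratio of fold
increases. The sketch below uses the round figures quoted above (a lower
bound of 100 times and roughly 10 times) purely as illustrative inputs; they
are not measured ELISA values, and the variable names are hypothetical.

```python
import math

# Fold increases in thaumatin synthesis relative to the plasmid without
# GAPDH promoter, as quoted in the text (round numbers, not raw data).
fold_long_promoter = 100.0   # -02 / -03 type promoter: "more than 100 times"
fold_short_promoter = 10.0   # 280-nucleotide fragment: "about 10 times"

orders_long = math.log10(fold_long_promoter)
orders_short = math.log10(fold_short_promoter)
relative_gain = fold_long_promoter / fold_short_promoter

print(orders_long, orders_short, relative_gain)
```

With these inputs the longer promoter fragment comes out roughly ten times
as effective as the shorter one; since the first figure is a lower bound,
the true ratio may be larger.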

In another experiment the expression of
preprothaumatin and prothaumatin encoding genes was
compared. As expression of the prothaumatin encoding
gene turned out to be almost negligible, this
result clearly demonstrated the enormous impact
which signal sequences can have on gene expression.
The importance of the processing step was further
substantiated by the results shown in Fig. 20. The
latter experiment demonstrates that yeast cells
harbouring plasmids encoding preprothaumatin are
able to produce a thaumatin-like protein with a
molecular weight which is practically the same as
the molecular weight of thaumatin II molecules
isolated from the plant.

This suggests that yeast is able to correctly pro- 
cess the plant signal sequence. Recently obtained 
data on the amino-terminal amino acid sequence of 
yeast-synthesized thaumatin have provided definite 
evidence for this notion. 

In conclusion, the results obtained strongly
suggest that the signal sequence plays an important
role in either stimulating protein synthesis or
increasing protein stability in yeasts. It is not yet
certain whether prothaumatin or thaumatin is produced
by yeasts in view of the small difference in
molecular weight (about 3%) between these proteins,
whereas the difference with preprothaumatin (about
10%) is easily detectable as was demonstrated in
Fig. 20.

10. The introduction of GAPDH transcription termin- 
ation/polyadenylation regions into pURY plasmids. 


One of the possibilities to introduce GAPDH
transcription termination/polyadenylation regions into
pURY plasmids is to use the unique Hind III
restriction site of these plasmids for insertion of
various parts of the Afl II fragment depicted in
Fig. 12. For example, the insertion of
transcription termination fragment A (cf. Fig. 12) can
be carried out as follows. Cleavage of plasmid pURY
528-03 with Hind III and a subsequent Mung bean
nuclease digestion (cf. 2) will yield a linearized
and blunt-ended plasmid molecule (cf. Fig. 14).
Incubation of this DNA with T4 DNA ligase and a
suitable Bam HI linker will equip the fragment with Bam
HI sites (cf. 8). Upon transformation of the Bam
HI-cleaved and religated product into E. coli, pURY
528-03 plasmids in which the Hind III site is
replaced by a Bam HI site can be recovered. Into this
newly created Bam HI site transcription termination
fragment A can be inserted. Digestion of the latter
construction with Hpa I will indicate whether or
not fragment A is inserted in the correct
orientation, since in the 2 micron DNA sequence an
additional Hpa I site is available.

30 Using the same approach but different linker mole- 

cules, different terminator fragments can be intro- 
duced at various sites in the plasmid molecule. 
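The orientation check by Hpa I digestion can be illustrated with a small calculation. The sketch below assumes one Hpa I site in the vector (in the 2 micron DNA) and one Hpa I site located asymmetrically in the inserted fragment; all lengths and positions are hypothetical round numbers chosen for illustration, not values taken from the plasmids described here.

```python
def circular_fragments(plasmid_len, cut_positions):
    """Sorted fragment lengths after cutting a circular molecule."""
    cuts = sorted({p % plasmid_len for p in cut_positions})
    if not cuts:
        return [plasmid_len]  # an uncut circle stays one molecule
    frags = []
    for i, pos in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]
        # Distance to the next site, wrapping around the origin; a single
        # cut linearizes the circle and gives one full-length fragment.
        frags.append((nxt - pos) % plasmid_len or plasmid_len)
    return sorted(frags)

def hpa_digest(vector_len, vector_site, insert_at, insert_len,
               site_in_insert, forward):
    """Fragment pattern for an insert placed in either orientation."""
    total = vector_len + insert_len
    # A vector site downstream of the insertion point is shifted by the insert.
    shifted = vector_site + insert_len if vector_site >= insert_at else vector_site
    # Reversing the insert mirrors the position of its internal site.
    offset = site_in_insert if forward else insert_len - site_in_insert
    return circular_fragments(total, [shifted, insert_at + offset])

# Hypothetical example: 8000 bp vector, 600 bp insert at position 1000,
# internal site 100 bp from one end of the insert.
forward_pattern = hpa_digest(8000, 6000, 1000, 600, 100, forward=True)
reverse_pattern = hpa_digest(8000, 6000, 1000, 600, 100, forward=False)
print(forward_pattern, reverse_pattern)
```

Because the site inside the insert is off-centre, the two orientations give different fragment pairs, which is what makes the digest diagnostic.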

11. Construction of an E. coli-yeast shuttle vector
widely applicable for the expression of foreign
genes in yeast (Fig. 15).

Derivatives of the E. coli-yeast shuttle vectors
described under 5 and 9, useful as generally
applicable yeast expression vectors, can be prepared.
In these derivatives the GAPDH promoter/regulation
region including a chemically synthesized DNA
fragment can be used for transcription initiation,
whereas transcription termination can be effected by
the transcription termination/polyadenylation region
of the "Able" operon of the 2 micron DNA. Insertion of
foreign genes in these expression vectors can be
accomplished by homopolymer (poly dA) tailing of
the Sac I site in the promoter and the Hind III
site present in the "Able" operon in combination
with poly dT tailing of the nucleotide sequence to
be inserted.

For the preparation of the vector molecule, plasmid
pURY 528-03 can be cleaved with Hind III and then
incubated with Klenow DNA polymerase and the four
dNTPs to generate a blunt-ended, linearized DNA
molecule (Fig. 15). Subsequently this DNA molecule
can be cleaved with Sac I and, of the two resulting
DNA fragments, the larger can be isolated from a
0.7% agarose gel and incubated with terminal
transferase and dATP under conditions described by
G. Deng and R. Wu [Nucleic Acids Res. 9, 4173-4188
(1981)]. The time of incubation has to be such that
the tail added to the 3' end generated by cleavage
with Sac I has a length of about 20 dA residues.

The introduction of a foreign gene into the poly
dA-tailed expression vector can be carried out by
incubating the DNA containing this gene with a set
of restriction enzymes such that the desired gene
can be cleaved as close as possible upstream of the
translation initiation codon and downstream of the
translation termination codon of this gene. Since
promoter regions can be preceded by transcription
termination signals, it is important that the
original promoter is not contained within the
resulting DNA fragment. After purification, the
trimmed DNA fragment can be equipped with poly dT
tails.
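The choice of enzymes for trimming a gene can be sketched as a simple search. The function below is a hypothetical helper, not part of the procedure described: it looks for recognition sequences that lie closest to the coding region on either side and rejects any enzyme that also has a site overlapping the coding region. It works with recognition-site positions rather than exact cleavage positions, and the test sequence is invented for illustration.

```python
def find_sites(seq, site):
    """Start positions of every occurrence of a recognition sequence."""
    return [i for i in range(len(seq) - len(site) + 1) if seq.startswith(site, i)]

def closest_flanking_sites(seq, orf_start, orf_end, enzymes):
    """Pick the sites nearest to the coding region [orf_start, orf_end).

    enzymes maps an enzyme name to its recognition sequence. Enzymes with
    a site overlapping the coding region are skipped entirely.
    Returns ((name, position) upstream, (name, position) downstream).
    """
    best_up, best_down = None, None
    for name, site in enzymes.items():
        hits = find_sites(seq, site)
        if any(p < orf_end and p + len(site) > orf_start for p in hits):
            continue  # this enzyme would cut inside the gene
        for p in hits:
            if p + len(site) <= orf_start:
                if best_up is None or p > best_up[1]:
                    best_up = (name, p)
            else:  # no overlap, so the site lies downstream of the gene
                if best_down is None or p < best_down[1]:
                    best_down = (name, p)
    return best_up, best_down

# Invented 32-nucleotide example: flank + short coding region + flank.
example = "TTGAATTCAA" + "ATGAAGCTTTAA" + "CCGGATCCTT"
enzymes = {"Eco RI": "GAATTC", "Bam HI": "GGATCC", "Hind III": "AAGCTT"}
up, down = closest_flanking_sites(example, 10, 22, enzymes)
print(up, down)
```

In this toy case Hind III is rejected because its site lies inside the coding region, leaving Eco RI upstream and Bam HI downstream.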

If the gene to be inserted is obtained by reverse
transcription of an mRNA molecule, the poly dT
tails can be directly added to the S1 nuclease-
treated double-stranded DNA molecule. In all cases
the time of incubation with terminal transferase
must be chosen such that poly dT tails with a
length of about 20 nucleotides are generated. The
DNA to be inserted and the poly dA-tailed vector
molecule can then be hybridized by incubation at
65°C for 10 minutes, followed by cooling down the
hybridization mixture slowly to room temperature.
The mixture can subsequently be transformed into
CaCl2-treated E. coli cells. Plasmid DNA isolated
from ampicillin-resistant transformants can then be
used to transform yeast cells.

12. Expression in K. lactis of the preprothaumatin
encoding gene under control of the promoter/regulation
region of the glyceraldehyde-3-phosphate
dehydrogenase encoding gene of S. cerevisiae
(Fig. 16, Fig. 17).

The E. coli-yeast shuttle vector pEK 2-7 consists
of plasmid YRp7 [D.T. Stinchcomb et al., Nature
282, 39-43 (1979)] containing the 1.2 kb KARS-2
fragment. Owing to the presence of the yeast trp 1
gene, plasmid pEK 2-7 can be maintained in K.
lactis SD11 (lac 4, trp 1; cf. pages 18 and 19 of
European Patent Application N° 0 096 910 A1).



To demonstrate the functionality of the promoter/
regulation region of the GAPDH encoding gene in
K. lactis, plasmid pUR 528-03 (Fig. 8) has been
equipped with both KARS-2 and the trp 1 gene. The
latter two functions were excised from pEK 2-7 by a
digestion with Bgl II followed by the isolation
from a 0.7% agarose gel of the smallest fragment
generated. This purified fragment was then inserted
in the dephosphorylated Bgl II cleavage site of pUR
528-03 by incubation with T4 DNA ligase.
Transformation of the ligation mix into CaCl2-treated
E. coli cells yielded plasmid pURK 528-03 (Fig. 16).
Transformants generated by the introduction of the
latter plasmid into K. lactis SD 11 cells by the
procedure described in European Patent Application
N° 0 096 910 A1 could be shown to synthesize
preprothaumatin (Fig. 17).

By techniques similar to those mentioned above,
plasmid pURY 528-03 was also equipped with KARS-2
and the yeast trp 1 gene and introduced into K.
lactis SD 11 (Fig. 16). Using the same detection
procedure, K. lactis cells carrying pURK 528-33
could also be shown to synthesize preprothaumatin
(Fig. 17). Preprothaumatin production in cells
containing pURK 528-33 was, however, slightly
higher than in cells containing pURK 528-03. Since
similar observations have been made by C. Gerbaud
et al. [Gene 5, 233-253 (1979)] in the expression
of the yeast ura 3 gene upon insertion of this gene
within the coding region "Able" of the 2 micron DNA,
it is very likely that the enhanced expression of
preprothaumatin by pURK 528-33 is due to efficient
transcription termination events in the
transcription termination/polyadenylation region of
the "Able" operon. This observation indicates that the
presence of an efficient transcription termination/
polyadenylation region downstream of a structural
gene transcribed by the GAPDH promoter/regulation
region is an important factor in optimizing gene
expression.

13. Integration of structural genes under the control
of the GAPDH promoter/regulation region into the
yeast chromosome (Fig. 18, Fig. 19).

Integration of DNA sequences encoding either a
heterologous or homologous structural gene under
transcriptional control of the GAPDH promoter/
regulation region and, optionally, the GAPDH
termination/polyadenylation region into the yeast
genome can be achieved on the basis of techniques
described by R.J. Rothstein [in Methods in
Enzymology 101, 202-211 (1983)] and K. Struhl [Nature
305, 391-397 (1983)]. The criteria to apply these
techniques are the availability of (i) suitable
marker genes, (ii) cloned DNA sequences homologous
with DNA sequences present on the yeast genome and
(iii) an intact, homologous or heterologous,
protein-encoding DNA sequence.

Having the marker genes (e.g. leu 2, trp 1, his 3)
and the protein-encoding DNA sequence (e.g. the
sequences encoding thaumatin-precursors or
chymosin-precursors) available, the latter
sequences can be integrated into the yeast genome
by homologous recombination events either between
GAPDH promoter and terminator sequences (cf. Fig.
19) or between the marker genes. In the latter
approach, which offers a variety of integration
sites, the foreign protein encoding DNA sequence is
integrated within the "wild type" marker gene,
thereby destroying the function of this gene.

14. Chemical synthesis of structural gene.
Construction of synthetically chimeric genes.

Expression experiments of the various structural
genes of preprothaumatin and preprochymosin -
located on the 2 µm DNA vector and under control
of the GAPDH promoter/regulation region and the
GAPDH termination region - in S. cerevisiae
resulted in an expression yield that was considered
not yet economically attractive for the
fermentative production of these proteins. The
expression of preprothaumatin and preprochymosin
genes located on the KARS-vector did not match
economic feasibility either.

Another problem observed during expression
experiments was that preprothaumatin was processed
correctly in both S. cerevisiae and K. lactis into
thaumatin (Fig. 20) (it is not clear whether the 6
amino acids on the C-terminus are also removed
during processing); however, such correct
processing could not be detected with preprochymosin.

Therefore it is advocated to synthesize the
preprochymosin and preprothaumatin genes using the
codons preferred by S. cerevisiae [J.P. Holland and
M.J. Holland, J. Biol. Chem. 255, 2596-2605 (1980)]
in order to increase the expression yield. The
syntheses of these genes can be carried out
according to the methods described under 4 and 6.
The nucleotide sequences of chymosin and thaumatin
in a preferred codon usage are given in Fig. 21 and
Fig. 22.
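The use of preferred codons can be illustrated with a short back-translation sketch. The codon table below follows the list of yeast-preferred codons given in claim 8; where that list offers two codons for an amino acid, one of them is picked here (an assumption made for this example, with the choice of GAC for aspartic acid being arbitrary). The amino acid sequence is the invertase signal polypeptide of claim 6, and the result is compared with the invertase leader sequence as read from Fig. 23.

```python
# One preferred codon per amino acid, taken from the list in claim 8.
# Where two codons are listed there, a single one is chosen (assumption).
PREFERRED_CODON = {
    "Ala": "GCT", "Leu": "TTG", "Arg": "AGA", "Lys": "AAG", "Asn": "AAC",
    "Met": "ATG", "Asp": "GAC", "Phe": "TTC", "Cys": "TGT", "Pro": "CCA",
    "Gln": "CAA", "Ser": "TCC", "Glu": "GAA", "Thr": "ACT", "Gly": "GGT",
    "Trp": "TGG", "His": "CAC", "Tyr": "TAC", "Ile": "ATT", "Val": "GTT",
}

def back_translate(peptide):
    """Translate a dot-separated three-letter peptide into preferred codons."""
    return "".join(PREFERRED_CODON[residue] for residue in peptide.split("."))

# Signal polypeptide of S. cerevisiae invertase (claim 6a).
INVERTASE_SIGNAL = ("Met.Leu.Leu.Gln.Ala.Phe.Leu.Phe.Leu.Leu."
                    "Ala.Gly.Phe.Ala.Ala.Lys.Ile.Ser.Ala")

# Invertase leader sequence as printed in Fig. 23 (spaces removed).
FIG23_INVERTASE = ("ATGTTGTTGC AAGCTTTCTT GTTCTTGTTG "
                   "GCTGGTTTCG CTGCTAAGAT TTCCGCT").replace(" ", "")

designed = back_translate(INVERTASE_SIGNAL)
print(designed)
print(designed == FIG23_INVERTASE)
```

With this particular choice among the alternative codons, the back-translation reproduces the 57-nucleotide invertase leader of Fig. 23.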

As described under 4 and 6, the synthesis can be
carried out by making small (up to 30 nucleotides)
single strand DNA fragments with partial
overlapping sequences. This method makes it possible
to obtain the various maturation forms of both
genes simultaneously (cf. Fig. 21-22).
Moreover, the approach adopted is suitable for
making changes in parts of the sequences. One can
use this possibility to replace the leader sequence
of preprochymosin (nucleotides -174 to -127,
Fig. 21) by various other leader sequences. Two
typical yeast leader sequences are that of acid
phosphatase [K. Arima et al., Nucleic Acids Res.
11, 1657-1672 (1983)] and that of invertase [R.
Taussig and M. Carlson, Nucleic Acids Res. 11,
1943-1954 (1983)]. Examples of nucleotide sequences
encoding both leader sequences with codons
preferred by yeasts are given in Fig. 23, whereas Fig.
24 gives schematically two designed chimeric acid
phosphatase/prochymosin and invertase/prochymosin
genes. Because preprothaumatin was processed
correctly by the yeast cells, we also designed a
chimeric gene of prochymosin and the leader
sequence of the preprothaumatin gene (Fig. 24).
Based on physico-chemical considerations about
the nature of the yeast leader sequences and the
interaction of these leader sequences with the
signal recognition protein [P. Walter and G.
Blobel, Proc. Natl. Acad. Sci. USA 77, 7112-7116
(1980)], we also designed two consensus leader
sequences (Fig. 25) which can be used to make
chimeric genes of these consensus sequences with
the prochymosin gene (Fig. 24).
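The synthesis strategy mentioned above (single-stranded fragments of up to 30 nucleotides with partially overlapping sequences) can be sketched as follows. This is an illustrative design only, assuming that fragments of the two strands are staggered by half a fragment length so that each lower-strand fragment bridges a junction between upper-strand fragments; the actual fragment boundaries used under 4 and 6 are not reproduced here. The example sequence is the invertase leader of Fig. 23.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def overlapping_oligos(seq, max_len=30, stagger=None):
    """Split a duplex into upper- and lower-strand fragments (all 5' to 3').

    Upper-strand fragments tile the sequence in steps of max_len.
    Lower-strand boundaries are shifted by `stagger` nucleotides, so that
    each junction in one strand is covered by a fragment of the other.
    """
    stagger = max_len // 2 if stagger is None else stagger
    upper = [seq[i:i + max_len] for i in range(0, len(seq), max_len)]
    bounds = [0] + list(range(stagger, len(seq), max_len)) + [len(seq)]
    lower = [reverse_complement(seq[a:b])
             for a, b in zip(bounds, bounds[1:]) if b > a]
    return upper, lower

leader = "ATGTTGTTGCAAGCTTTCTTGTTCTTGTTGGCTGGTTTCGCTGCTAAGATTTCCGCT"
upper, lower = overlapping_oligos(leader)
for oligo in upper + lower:
    print(len(oligo), oligo)
```

Each fragment stays within the 30-nucleotide limit, and annealing the two sets would reconstitute the full duplex, with the junctions of one strand held together by the other.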
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Legends to the figures 



Fig. 1  Restriction endonuclease cleavage map of a region of plasmid
        pFl 1-33 containing a yeast glyceraldehyde-3-phosphate
        dehydrogenase operon.

Fig. 2  Nucleotide sequence of the promoter/regulation region of a
        glyceraldehyde-3-phosphate dehydrogenase operon cloned in
        pFl 1-33. TATA and TATAAA sequences are indicated by solid
        underlining. Presumptive transcription termination signals are
        underlined with dots. Nucleotide sequences between brackets
        indicate inverted repeats. The nucleotide sequence encoding the
        38 amino acids long peptide is enclosed in a box.

Fig. 3  Nucleotide sequence of the transcription termination/
        polyadenylation region of a glyceraldehyde-3-phosphate
        dehydrogenase operon cloned in pFl 1-33. AATAA sequences are
        indicated by solid underlining. Presumptive transcription
        termination signals are underlined with dots.

Fig. 4  Schematic representation of the insertion of the Eco RI (Dde I)
        GAPDH promoter/regulation fragment in the preprothaumatin
        encoding plasmid and removal of an Eco RI cleavage site from
        the resulting plasmid.

Fig. 5  Schematic representation of the construction of (pre)(pro)-
        (pseudo)chymosin encoding plasmids containing the Eco RI (Dde I)
        GAPDH promoter/regulation fragment.

Fig. 6  Schematic representation of the structure of the GAPDH
        promoter/regulation region including potential stem and loop
        structures. Presumptive transcription termination signals are
        indicated by slashes. The position of the coding sequence for
        the 38 amino acids long peptide on the fragment is shown by ATG
        and TAA codons.

Fig. 7a Representation of the various steps involved in the preparation
        of the synthetic DNA fragment used to reconstitute the
        original GAPDH promoter/regulation region upstream of
        preprothaumatin encoding nucleotide sequences.

Fig. 7b Representation of the synthetic DNA fragment used to
        reconstitute the original GAPDH promoter/regulation region
        upstream of (pre)(pro)(pseudo)chymosin encoding nucleotide
        sequences.

Fig. 8  Schematic representation of the introduction of the synthetic
        DNA fragment in preprothaumatin encoding plasmids.

Fig. 9  Schematic representation of the removal of the Sac I site from
        the reconstituted GAPDH promoter/regulation region.

Fig. 10 Schematic representation of the introduction of the synthetic
        DNA fragment in (pre)(pro)(pseudo)chymosin encoding plasmids.

Fig. 11 General scheme for synthesis of DNA fragments on a polystyrene
        support.

Fig. 12 Schematic representation of an Afl II-Bam HI transcription
        termination/polyadenylation fragment obtained from pFl 1-33
        and its insertion, in combination with the M13 central
        transcription termination signal, downstream of the nucleotide
        sequence encoding pseudochymosin.

Fig. 13 Schematic representation of plasmid pMP81.

Fig. 14 Schematic representation of the introduction of the 2 micron
        DNA origin of replication and the leu 2 gene obtained from
        pMP81 by digestion with Hind III and Sal I into preprothaumatin
        encoding plasmids.

Fig. 15 Schematic representation of an E. coli-yeast shuttle vector
        widely applicable for the expression of foreign genes in
        yeast.

Fig. 16 Schematic representation of the introduction of the KARS-2 and
        trp 1 gene obtained from pEK 2-7 by digestion with Bgl II into
        preprothaumatin encoding plasmids.

Fig. 17 Fluorogram of (35S)-labelled thaumatin-like proteins
        synthesized by K. lactis SD11 cells containing plasmid pEK 2-7
        (lane a), plasmid pURK 528-03 (lane b) and plasmid pURK 528-33
        (lane c). Yeast transformants were grown for 3 hours on a
        minimal medium containing 35S-cysteine. Cells were collected
        by centrifugation, resuspended in 1 ml 2.0 mol/l sorbitol,
        0.025 mol/l NaPO4 pH 7.5, 1 mmol/l EDTA, 1 mmol/l MgCl2,
        2.5% β-mercaptoethanol, 1 mg/ml zymolyase 60.000 and incubated
        for 30 minutes at 30°C. Spheroplasts were then centrifuged
        and lysed by the addition of 270 µl H2O, 4 µl 100 mmol/l
        PMSF, 8 µl 250 mmol/l EDTA, 40 µl 9% NaCl and 80 µl 5x PBSTDS
        (50 mmol/l NaPO4 pH 7.2, 5% Triton X 100, 2.5% deoxycholate,
        2.5% SDS). Immunoprecipitation of thaumatin-like proteins and
        analysis of precipitated proteins was carried out as described
        by L. Edens et al., Gene 18, 1-12 (1982).

Fig. 18 Schematic representation of plasmid pEK 2-7.

Fig. 19 Schematic representation of a plasmid which can be used for the
        insertion of a pseudochymosin-encoding nucleotide sequence
        under transcriptional control of the GAPDH promoter/regulation
        and the GAPDH termination/polyadenylation region into the yeast
        chromosome.

Fig. 20 Fluorogram of (35S)-cysteine labelled thaumatin-like protein
        synthesized by S. cerevisiae AH 22 cells containing plasmid pURY
        528-03 (lane c) or synthesized in vitro in a wheat germ protein
        synthesizing system under the direction of mRNA purified from
        arils of Thaumatococcus fruits (lane b). Lane a shows
        radioactive marker proteins (obtained from Amersham) and lane d
        shows the position of thaumatin II isolated from the arils of
        Thaumatococcus fruits. Lysis of yeast cells was carried out
        as described in the legend of Fig. 17. Wheat germ translation
        of mRNA and immunoprecipitation procedures were carried out
        as described by L. Edens et al., Gene 18, 1-12 (1982).

Fig. 21 Nucleotide sequence of the gene encoding (pre)(pro)(pseudo)-
        chymosin in codon usage preferred by S. cerevisiae.

Fig. 22 Nucleotide sequence of the gene encoding (pre)(pro)thaumatin
        in codon usage preferred by S. cerevisiae.

Fig. 23 Nucleotide sequences of the genes encoding the leader sequences
        of acid phosphatase and invertase in codon usage preferred
        by S. cerevisiae.

Fig. 24 Schematic representation of the construction of the gene
        encoding prochymosin provided with the leader sequences of
        the invertase, acid phosphatase or thaumatin encoding genes,
        or with the two designed consensus leader sequences.

Fig. 25 Nucleotide sequences of the two designed consensus leader
        sequences.
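The composition of the lysis mixture in the legend of Fig. 17 can be checked with a short dilution calculation. The volumes and stock concentrations below are as read from that legend; the scan is damaged at this point, so they should be treated as assumptions, and the volume of the spheroplast pellet itself is ignored.

```python
# (volume in microlitres, stock concentration, unit); read from the
# legend of Fig. 17 and treated here as assumptions.
LYSIS_MIX = {
    "H2O":    (270.0, 0.0,   ""),
    "PMSF":   (4.0,   100.0, "mmol/l"),
    "EDTA":   (8.0,   250.0, "mmol/l"),
    "NaCl":   (40.0,  9.0,   "%"),
    "PBSTDS": (80.0,  5.0,   "x"),
}

def final_concentrations(mix):
    """Final concentration of each component after pooling all volumes."""
    total = sum(volume for volume, _, _ in mix.values())
    finals = {name: stock * volume / total
              for name, (volume, stock, _) in mix.items()}
    return total, finals

total_volume, finals = final_concentrations(LYSIS_MIX)
print(f"total volume: {total_volume:.0f} microlitres")
for name, (_, _, unit) in LYSIS_MIX.items():
    if unit:
        print(f"{name}: {finals[name]:.2f} {unit}")
```

Read this way, the additions dilute the 5x PBSTDS stock to about 1x and give about 1 mmol/l PMSF, 5 mmol/l EDTA and 0.9% NaCl, which supports this reading of the volumes.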



WO 84/04538 



PCT/EP84/00153 



CLAIMS 

1. A DNA sequence capable of initiating transcription
by yeast RNA polymerase II which includes at
least part of the regulon region of one of the GAPDH
genes of S. cerevisiae, characterized in that it
comprises a DNA sequence essentially as given in Fig. 2,
and wherein the regulon region is optionally modified
to include at least one restriction enzyme cleavage
site, to facilitate manipulation of the nucleotide
sequence region for protein synthesis initiation.

2. A DNA sequence capable of both termination of
the transcription by yeast RNA polymerase II and
effecting polyadenylation of the mRNA, which includes at
least part of the termination/polyadenylation region
belonging to one of the GAPDH genes of S. cerevisiae,
characterised in that it comprises a DNA sequence
essentially as given in Fig. 3.

3. DNA sequence, which can be inserted into a
recombinant DNA plasmid or into a yeast chromosome,
comprising:

(a) a DNA sequence according to claim 1, and

(b) one or more structural genes different from the
    GAPDH genes of S. cerevisiae, and at least two of
    features (c)-(f),

(c) one or more specific DNA sequences capable of
    terminating the transcription by yeast RNA polymerase
    II and effecting polyadenylation of the mRNA,
    and/or

(d) one or more selection markers, and/or

(e) either one or more nucleotide sequences allowing a
    stable insertion in a chromosome of yeasts or one
    or more DNA sequences which regulate DNA
    replication in yeasts belonging to the genus
    Saccharomyces or to the genus Kluyveromyces, and/or

(f) a DNA sequence encoding a signal polypeptide of not
    more than 30 amino acid residues assisting the
    translocation of proteins, which DNA sequence is
    situated upstream of and in the same reading frame
    as the structural gene.

4. DNA sequence according to claim 3, characterized
in that the structural gene of (b) encodes thaumatin
or chymosin, or their various allelic or modified
forms, which modified forms do not impair the
sweet-tasting properties of thaumatin or the milk-clotting
properties of chymosin, respectively, or precursors of
these thaumatin-like or chymosin-like proteins.

5. DNA sequence according to claim 3, characterized
in that the specific DNA sequence of (c) is a
DNA sequence essentially as given in Fig. 3.

6. DNA sequence according to claim 3, characterized
in that the DNA sequence of (f) encoding a signal
polypeptide is selected from the group consisting
of DNA sequences encoding

(a) the signal polypeptide translocating S. cerevisiae
    invertase, namely Met.Leu.Leu.Gln.Ala.Phe.Leu.Phe.-
    Leu.Leu.Ala.Gly.Phe.Ala.Ala.Lys.Ile.Ser.Ala;

(b) the signal polypeptide translocating S. cerevisiae
    acid phosphatase, namely Met.Phe.Lys.Ser.Val.Val.-
    Tyr.Ser.Ile.Leu.Ala.Ala.Ser.Leu.Ala.Asn.Ala;

(c) the signal polypeptide of unmatured forms of
    thaumatin-like proteins, namely Met.Ala.Ala.Thr.-
    Thr.Cys.Phe.Phe.Phe.Leu.Phe.Pro.Phe.Leu.Leu.Leu.-
    Leu.Thr.Leu.Ser.Arg.Ala;

(d) the signal polypeptide of unmatured forms of
    chymosin-like proteins, namely Met.Arg.Cys.Leu.Val.-
    Val.Leu.Leu.Ala.Val.Phe.Ala.Leu.Ser.Gln.Gly; and

(e) two consensus signal polypeptides, namely
    Met.Ser.Lys.Ala.Ala.Leu.Ala.Phe.Ile.Ala.Phe.Val.-
    Ile.Val.Leu.Ile.Val.Asn.Ala and
    Met.Ser.Lys.Phe.Val.Ile.Val.Leu.Ile.Val.Ala.Ala.-
    Leu.Ala.Phe.Ile.Ala.Asn.Ala.

7. DNA sequence according to claim 3, charac- 
terized in that either the codons of the structural 
gene of (b) or the codons of the signal polypeptide- 
encoding DNA sequence of (f), or both, are modified 
into codons preferred by yeasts. 

8. DNA sequence according to claim 7, characterized
in that as codons preferred by yeasts the
following codons are used:

    GCC or GCT    alanine
    TTG           leucine
    AGA           arginine
    AAG           lysine
    AAC           asparagine
    ATG           methionine
    GAC or GAT    aspartic acid
    TTC           phenylalanine
    TGT           cysteine
    CCA           proline
    CAA           glutamine
    TCC or TCT    serine
    GAA           glutamic acid
    ACC or ACT    threonine
    GGT           glycine
    TGG           tryptophan
    CAC           histidine
    TAC           tyrosine
    ATC or ATT    isoleucine
    GTC or GTT    valine.



9. Process for preparing yeasts containing rDNA
sequences, characterized in that DNA sequences as
claimed in claim 3 are introduced into Saccharomyces,
Kluyveromyces, Debaryomyces, Hansenula, Candida,
Torulopsis or Rhodotorula yeasts, either in the form of
plasmids or by incorporation in the yeast genome.

10. Yeast containing an rDNA sequence as claimed
in claim 3, either in the form of a plasmid or
incorporated in the yeast genome.

11. Process for preparing a protein by cultivation
of a yeast containing rDNA sequences,
characterized in that a yeast as claimed in claim 10 is used
to produce a protein, whereby the rDNA sequence
contains a structural gene encoding that protein or a
precursor thereof which precursor during processing can
form the relevant protein.

12. Protein, produced by a process according to
claim 11.
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Fig 2. 
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Fig.3. 



7 17 27 37 47 57 

TAAATTTAAC TCCTTAAGGT TACTTTAATG ATTTAGTTTT TATTATTAAT_AATTCATGCT 

CATGACATCT CATATACACG TTTATAAAAC TTAAATAGAT TGAAAATGTA TTAAAGATTC 
127 137 147 157 167 - 

CTCAGGGATT CGATTTTTTT GGAAGTTTTT GTTTTTTTTT CCTTGAGATG CTGTAGTATT 
1A7 197 207 217 ££t 

tgggaacaat tatacaatcg aaagatatat gcttacattc gaccgtttta gccgtgatca 
ttatcctata gtaacataac ctgaagtata actgacacta ctatcatcaa tacttgtcac 
atgagaactc tgtgaataat taggccactg aaatttgatg cctgaaggac cggcatcacg 
:gat aaagcactta. 

GGTCATTTCC AAGTATTG*TT TCCAAGCATC GTACCTTTCA CCATTTGGAG TATCACTTAG 
487 497 507 517 527 *« 



TATCTTCGAT AAAGCACTTA.^GTATCACACT AATTGGCTTT TCGCCGCATA TGGTGTTTCC 
. — / -» f 44 7 4 57 *» o / 



CGTTTTCATC GCATATCTGT. CCATTATTTC AATGGATTGC CAAATGGGAA CTTGATGATG 

557 56 7 57 7 3»' 

CTCCTAGCAG TTAACA* 
617 

CTTGCTTACC GACGTA< 
AATCACTGCT TACAATgHLaAATTGTTCG GATCCTTAAT GTACTCCGAC AAAATATTAC 



TGAAAGTTTA CTCCTAGCAG TTAACATTTC CACTTCTGTT TCCTCTTTAA TGGCATTCAT 

TCAACTCWC CTTGCTTACC GACGTACCCG TATATTGGAA TCTGCGGCCC CAATGACACA 

577 687 697 /U/ 



727 T37 747 757 767 

IACG 

tatctaaIgI aatttctctc catttcaaag TTTCCACCAA CATGCGGAGC tgcatctcta 

847 857 867 877 88/ 



caatgcaacg atcaacatca acgctgttat gagaaaccat catgggaatt accttcaccg 

807 817 827 OJ/ 



AGGAATGTTC AGCCATATCA GTGTCATGAT CCATTGGCTT AAACAGCTTC TTTCCGTTCT 
CAGGATACTC CTTCTGTATT AATGTTTTAC ACAAGTCTGT ATCCACTTTC AGATTACCCA 



907 917 927 937 947 

ICTC CTTCTGTATT AATGTTTTAC 

AGGGCGTCTC TAGCTCACTG AATGCACTaI CTAAAATTTG GTTTTTGAAA TAGATGTGAT 



t«vi» ISiT 1057 1067 

ccgacggScc caagataaat attctcttaa cattacggtt caaatccaac gatgcgtacg 

AGTAGGCCAT AGTGGGTCCA CAATACCT^T AACCGGCATG AGGACATATG ATAATTCTGG 
11A7 1157 1167 1177 1187 u " 

CGTTGTGAAT TGGGCCTTTA AGGGTACTTT TGATCAAGTA TGTATGCGGT TGTTGAGATA 
t 7fi7 1217 1227 1237 1247 

ATTCTTGGGC TCTATTGGAA TACCATGAGC CTGCATGTGT TGCTGGACGT ATTGACATGT 
\ *>kl 1277 1287 1297 1 J u / - 

TTGAAAAATT CTATTCTTTG CACTGTAGTC CACCTAAGCC ACCGACTAGG ACCACTTCAC 

1322 
TTAAG 
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pFll-33 
contains GAPDH gene
in 9 kb insert

[Fig. 4, scheme not reproduced. Labels recoverable from the drawing:
Sal I; isolate 4.3 kb GAPDH promoter containing fragment; Dde I;
isolate Pvu II site-containing promoter fragment; Klenow Pol, dNTPs;
Eco RI linker, T4 DNA ligase; Eco RI; promoter fragment; Eco RI,
Alk. phos.; T4 DNA ligase; partial Eco RI, Mung bean nuclease,
Alk. phos.; T4 DNA ligase. Restriction sites marked on the plasmid
maps: Dde I, Bgl II, Pvu II, Eco RI, Mlu I, Hind III, Sal I.]
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Fig 5. 
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Fig. 7a. 



                            Sac I
5' CCC.TTA.GTT.TCA.AAT.TAA.AGA.GCT.CAT.CAC 3'
                        3' TCT.CGA.GTA.GTG.TGT.TTG.TTT.GTT.TTG.TTT 5'
   Dde I

        Klenow DNA polymerase, dNTPs

     Dde I                  Sac I
5' CCC.TTA.GTT.TCA.AAT.TAA.AGA.GCT.CAT.CAC.ACA.AAC.AAA.CAA.AAC.AAA 3'
3' GGG.AAT.CAA.AGT.TTA.ATT.TCT.CGA.GTA.GTG.TGT.TTG.TTT.GTT.TTG.TTT 5'

        Dde I

                        Sac I
5' TTA.GTT.TCA.AAT.TAA.AGA.GCT.CAT.CAC.ACA.AAC.AAA.CAA.AAC.AAA 3'
    3' CAA.AGT.TTA.ATT.TCT.CGA.GTA.GTG.TGT.TTG.TTT.GTT.TTG.TTT 5'

        Sac I
        T4 DNA polymerase, dNTPs
        T4 DNA ligase

5' TTA.GTT.TCA.AAT.TAA.AGC.ATC.ACA.CAA.ACA.AAC.AAA.ACA.AA 3'
    3' CAA.AGT.TTA.ATT.TCG.TAG.TGT.GTT.TGT.TTG.TTT.TGT.TT 5'

Fig. 7b.

5' A.GCT.CAT.CAC.ACA.AAC.AAA.CAA.AAC.AAA 3'
3' TA.GTC.TGT.TTG.TTT.GTT.TTG.TTT 5'
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1 



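[Scheme not reproduced; by its content apparently Fig. 8. Labels
recoverable from the drawing: Eco RI, Mung bean nuclease; Bgl II,
Alk. phos.; isolate large fragment; pUR 528-02; Bgl II, Dde I;
isolate GAPDH-promoter fragment; synthetic 48-mer, Dde I; T4 DNA
ligase; isolate large fragment. Restriction sites marked on the
plasmid map: Dde I, Sac I, Bgl II.]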
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Ddel Sad 




T4 DNA ligase

Dde I  Sac I
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pFl 1-33 



[Fig. 12, scheme not reproduced. Labels recoverable from the drawing:
pFl 1-33: Afl II; Hind III linker, T4 DNA ligase, Hind III. M13 RF:
Taq I, S1 nuclease; isolate fragment containing gene VIII - gene III
junction; Bam HI linker, T4 DNA ligase, Bam HI; Sau 3A; isolate
fragment containing central transcription termination signal.
pBR 322: Bam HI, Alk. phos.; Hind III, Bam HI, Alk. phos.; T4 DNA
ligase; isolate large fragment; Hind III, partial Bam HI; isolate
fragment containing termination/polyadenylation fragment A.
pUR 1521-01 (unmethylated): Bcl I, Sal I, Alk. phos.; isolate large
fragment. Restriction sites marked on the maps: Afl II, Hpa I,
Bam HI, Hind III, Eco RI, Sau 3A/Bam HI, Pvu II, Bcl I/Bam HI, Sal I.]
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fig. ft. 




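[Scheme not reproduced; by its content apparently Fig. 14. Labels
recoverable from the drawing: pMP 81: Hind III, Sal I; isolate
fragment containing 2 micron replicon and leu 2 gene. Recipient
plasmid: Hind III, Sal I, Alk. phos.; isolate large fragment.
T4 DNA ligase.]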
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Fig. 16. 



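[Fig. 16, scheme not reproduced. Labels recoverable from the drawing:
pUR 528-03: Bgl II, Alk. phos. pEK 2-7: Bgl II, isolate small
fragment. T4 DNA ligase. pURY 528-03: Bgl II, Alk. phos. pEK 2-7:
Bgl II, isolate small fragment. T4 DNA ligase. Restriction sites
marked on the plasmid maps: Dde I, Pvu II, Bgl II, Sal I, Eco RI.]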
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pUR.1521-12 

[Fig. 19, scheme not reproduced. Labels recoverable from the drawing:
Bam HI, alk. phos.; plasmid pEK 2-7, Bam HI, isolate fragment
containing Trp1 gene; T4 DNA ligase; Bam HI, alk. phos.; second
plasmid (name illegible), Bam HI, isolate fragment containing
fragment B; T4 DNA ligase; Bgl II, Hind III; introduce fragments
into trp-1- yeast and select for trp+. Restriction sites marked on
the plasmid maps: Dde I, Pvu II, Eco RI, Bcl I/Bam HI, Sal I,
Bam HI/Bgl II, Hind III, Bam HI/Bam HI.]

Fig. 19
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Fig. 20. 
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Fig. 21. 

LEADER SEQUENCE PREPROCHYMOSIN IN S. CER. PREFERRED CODONS

ATGAGATGTT TGGTTGTTTT GTTGGCTGTT TTCGCTTTGT CCCAAGGT
-174 -165 -155 -145 -135 -127

DNA SEQUENCE PRO PART PRECEDING PSEUDOCHYMOSIN IN S. CER. PREFERRED CODONS
GCTGAAATTA ctagaattcc attgtacaag ggxaagxccx XGAGAAAGCC TTTGAACGAA 
- 12 6 -116 -106 -96 " 

CACGGTTTGT XGGAAGACTX C 
-66 "56 -«6 



DNA SEQUENCE PSEUDO PART PRECEDING CHYMOSIN IN S. CER. PREFERRED CODONS

XXGCAAAACC AACAATACGG TATTTCCTCC AACIACXCCG CXXXC 
-45 -36 -26 -16 ~ 6 "1 



DNA SEQUENCE CHYMOSIN IN S. CER. PREFERRED CODONS

GGTGAAGTTG CXXCCCXXCC AXTGACIaK XACIICCA^ CCCAAXAcS CCCXAACA^ 
XACXIGGGIA CXCCACCACA ACAAIICA X GXXXXGXXCG ACACXGGXXC CXCCCACXXC 
TG GCXXCCA°X CCAXXXA^ S XAAGXCC 2 GCIXCTAAGA ACCACCAAAG AYXCGACCCA 
A GAAAGXC 9 C°X CCACXIXCCA AAACXIGGGX AACCCATXCJ CCAXXCACXA CGGXACXGGX 
tC CAXGCAAC GXAXXXXGGO XXACCAC CX CXXACICXTX CCAACAXXCX IGACATTCAA 



CAAACXGXXG .TTTCTcSS XCAAGAACCA CCXCACCXIT XCACXXACGC XGAAXXCGAC 
GOXAXXXXGG CXAIGCCTXA CCCAXCC X0 GCXXCCGAAX ACXCCAXTCC AGXXXXCGAC 
aacaxgaxga acagacacx? ggxxgcx a CACXXCIXCX CCGXXXACAX GGACAGAAAC 



GCTCAAGAAl CCAXOXX^C XXXGGGX AXXGACCCAX CCXACXACAC ICCXXCCXXC 
CACXGGGXXC CAGXXACXGX XCAACAA AC XCCCAAXICA CXCXICACXC CGXXACXAIX 
XCCGGXGXXG XXGXXGCXXG XGAAGGX GX XGXCAAGCXA XXXXGGACAC XGGXACXXCC 
AAGXXGGXXG CXCCAtc'c CCACAXX AACATXCAAC AAGCXAIXGG XGCXACXCAA 
aaccaaxaS cxgaattcca CAXXGACXGX GACAACXXGX CCXACAXCCC aacxcxxgxx 
xxccaaaxxa accgxaaca gxaccca acxccaiccg cxxacacxxc ccaagaccaa 

OOXXXCXOxi CXXCCGGXXX CCAATCC AACCACXCCC AAAAGXGGAT XXXGGGICAC 
GTTTTCATTA GAGAAXACXA CXCCCXXXxS GACAG AGCXA ACAACXXGCX XGGTTTCCCT 
9 69 

AAGGCTATT
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Fig. 22. 



LEADER SEQUENCE OF PREPROTHAUMATIN IN S. CER. PREFERRED CODONS
ATGGCTGCTA CTACTTGTTT CTTCTTCTTG TTCCCATTCT TGTTGTTGTT GACTTTGTCC
-66 -57

AGAGCT
-1


DNA SEQUENCE MATURE THAUMATIN IN S. CER. PREFERRED CODONS 



10 20 30 40 50 60 

XC 

90 

:aa 

150 
kTT 
110 

;ac 

270 
TTC 
330 
GTT 
390 
ATI 

GGAGGKgSg 6TTCTAACCA CGG^TGTACT GTTTTGgIaa CTTGGGAATA CTGTTGTACT 



GGTAGTTTGG AAATTGTT.A GAG.TGTTGG TAGAGTGTGT GGGGTGGIGG ITGGAAGGG^ 

- ~ 80 90 *• uu - 

CAA 
150 
ATT 
210 
GAC 
27C 
.TTC 
33C 

:gti 

v»»w*»* * ~ o «nn 400 * IU 

, tM «S TTAGATG^GG TGGTGAGAJT GTTGGTGAAT GTCCAGCTAA GTTGAAGGCT 



GAGGGTGG^ TC«C C «oS TGCTAGACAA TTGAACTCCG GtGAATCGTG GACTAT.TAAC 

;att tgggctagaa 

210 220 
TGAC TGTGGTGGTT 
270 ^ 28C 

GGTAGACCAC WAUAuiii „„w^-ATTC TCCTTGAACC 350 350 

310 III <rfTriircTT CCAATGTACT TCTCCCCAAC TACTACAGCT 

CACATTTCCA ACATTAAGGG TTTCAACGTT CCAATGTACT ^ ^ Q 



GTTGAACCAC GT.GTAAGGG TGGTAAGATT tGGGGTAGA. GTG.CTGTTA GTTGGAGG.G 

xgcggtag" gtatttgSag aagtggtgag tgtggtggtt tgttgcaatg o TAAGAGATTG 
ggtagaccac caagtagSt gggtgaattg tggttga.gg aatagggtaa ggactagat. 



agtggta'ag? gtggtggaag tgaatagIgg acattcttca agagattgtg tcgagagggt 
ttgtggiacg ttttcgacaa gccaagtagt gitacttgtc caggticcic caactacaga 

610 620 
GTTACTTTCT CTCCAACTGC T 

DNA SEQUENCE ACIDIC PEPTIDE OF PROTHAUMATIN IN S. CER. PREFERRED CODONS

622 632 
TTGGAATTGG AACACGAA 






LEADER SEQUENCE ACIDIC PHOSPHATASE IN YEAST PREFERRED CODONS 

ATGTTCAAGT CCGTTGTTTA CTCCATTTTG GCTGCTTCCT TGGCTAACGC T 
-51 -41 -31 -21 -11 -1 



LEADER SEQUENCE INVERTASE IN YEAST PREFERRED CODONS

ATGTTGTTGC AAGCTTTCTT GTTCTTGTTG GCTGGTTTCG CTGCTAAGAT TTCCGCT 
-57 -47 -37 -27 -17 -7 



Fig. 23.
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Fig.25. 



CONSENSUS LEADER SEQUENCE (A-B) IN YEAST PREFERRED CODONS

ATGTCCAAGG CTGCTTTGGC TTTCATTGCT TTCGTTATTG TTTTGATTGT TAACGCT
-48 -38 -28

CONSENSUS LEADER SEQUENCE (B-A) IN YEAST PREFERRED CODONS
ATGTCCAAGT TCGTTATTGT TTTGATTGTT GCTGCTTTGG CTTTCATTGC TAACGCT
-48
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